SAMSI Special Guest Lecture – Bruce Pitman: Where Are You Gonna Go When the Volcano Blow?


Bruce Pitman is a Professor in the Department of Materials Design and Innovation, School of Engineering and Applied Sciences, at the University at Buffalo.

One of the major hazards of volcanic activity is inundation by debris flows, block and ash flows, and hot pyroclastic flows. These can run out many kilometers, travelling at speeds of 50 m/sec or more. A challenge for volcanology is to predict the chances of a flow inundating a specific location – that is, what areas are at risk of suffering a hazardous flow event.

A group of scientists from geology, engineering, mathematics, and statistics has been studying this problem for more than 15 years. Several of the scientists came together during the 2006-2007 SAMSI program on “Computer Models”, and have been collaborating ever since.

After lying dormant for more than 300 years, the Soufriere Hills Volcano on the island of Montserrat began an eruptive phase in 1996. Hundreds of mass flows have occurred during the last 17 years, ranging in size from O(10⁴) m³ of material to 200 × 10⁶ m³ of material. Several of the largest of these flows have caused tremendous damage to population centers on the island, to the extent that today more than half the island has been evacuated.

Using mathematics to derive model equations describing the flowing mass, and the sophisticated computer solver TITAN2D, scientists can simulate mass flows at Soufriere Hills, assuming certain input parameters. To give a sense of the computational effort, each simulation currently takes about 20-60 minutes on a 16-processor parallel cluster.


This figure shows contours from flow thickness computations, indicating safe and hazardous zones along the Belham Valley on Montserrat.

By combining flow data and expert opinion, one can develop a probabilistic model of the severity and frequency of flow events. The naïve approach to assessing the hazard probability at a given location, namely sampling this distribution to generate inputs and running those inputs through the simulator to estimate the percentage of these runs that ‘hit’ that location, is not feasible because of the rarity of catastrophic events and the expense of running the simulator. Furthermore, this approach ties the expensive simulator runs to a specific probabilistic model which may prove antiquated as new data and information become available.
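To see why the direct sampling approach is so costly, here is a rough back-of-the-envelope sketch. The hit probability and the target accuracy are assumed values chosen purely for illustration; only the 20-minute run time comes from the range quoted above. It uses the standard result that a Monte Carlo estimate of a probability p from N runs has variance p(1 - p)/N.

```python
import math

def runs_needed(p, rel_error):
    """Simulator runs needed so that a plain Monte Carlo estimate of a
    hit probability p has relative standard error rel_error.
    Follows from Var(p_hat) = p * (1 - p) / N."""
    return math.ceil((1.0 - p) / (p * rel_error**2))

# Assumed values for illustration only (not taken from the study):
p_hit = 1e-3          # hypothetical probability that a flow reaches the site
rel_error = 0.10      # target: 10% relative standard error
minutes_per_run = 20  # lower end of the 20-60 minute range quoted above

n_runs = runs_needed(p_hit, rel_error)
cluster_days = n_runs * minutes_per_run / 60 / 24
print(n_runs, round(cluster_days))
```

Under these assumptions the estimate calls for on the order of 100,000 simulator runs, which is years of cluster time for a single location and a single probability model.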

Our group introduced a new twist: a statistical emulator (a computationally cheap response surface approximating the output of simulations) is constructed, based on carefully chosen computer model runs. The speed of the emulator then allows us to ‘solve the inverse problem’: determine regions of input values (characteristics of the flow) which result in a catastrophic event. Then the probability that a catastrophic event will occur at a particular location can be quickly calculated under any probability distribution of inputs. With a careful arrangement of computer simulations and emulator constructions, we can calculate the probability of a catastrophe at many locations simultaneously, producing a hazard map like that shown in the figure. This map shows the hazard regions near the Belham Valley, under four different potential eruption scenarios. Civil protection officials have set “trigger points”, sites at which flows of a certain thickness initiate evacuation of particular neighborhoods. Our hazard maps can provide additional information to these officials, and help refine the trigger and evacuation workflow.
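The emulator idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the group's actual method or code: `toy_simulator` is a made-up stand-in for an expensive TITAN2D run, the two inputs and their ranges are hypothetical, the design is a simple grid, and the emulator is a basic Gaussian-process posterior mean with fixed, hand-picked hyperparameters.

```python
import numpy as np

# Input ranges (assumed for illustration): log10 of flow volume in m^3,
# and a basal friction angle in degrees.
LOW = np.array([4.0, 5.0])
HIGH = np.array([8.3, 25.0])

def toy_simulator(x):
    """Hypothetical stand-in for one expensive simulator run per row of x.
    Returns a flow thickness (m) at a single map location."""
    log_vol, friction = x[:, 0], x[:, 1]
    return 2.0 * (log_vol - 4.0) / (1.0 + 0.1 * friction)

def scale(x):
    """Map inputs to the unit square so one length scale suits both."""
    return (x - LOW) / (HIGH - LOW)

def rbf_kernel(a, b, length=0.5):
    """Squared-exponential covariance between rows of a and rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length**2)

# 1. A small, fixed design of "expensive" runs (a 6 x 6 grid here).
g1, g2 = np.meshgrid(np.linspace(0, 1, 6), np.linspace(0, 1, 6))
design = LOW + (HIGH - LOW) * np.column_stack([g1.ravel(), g2.ravel()])
y = toy_simulator(design)

# 2. Fit the emulator: Gaussian-process posterior mean with fixed
#    hyperparameters and a small nugget for numerical stability.
y_mean = y.mean()
K = rbf_kernel(scale(design), scale(design)) + 1e-6 * np.eye(len(design))
alpha = np.linalg.solve(K, y - y_mean)

def emulator(x):
    """Cheap approximation of toy_simulator at new inputs x."""
    return y_mean + rbf_kernel(scale(x), scale(design)) @ alpha

# 3. Hazard probability under any input distribution, using only the
#    emulator: the fraction of sampled inputs whose emulated thickness
#    exceeds the (assumed) critical threshold.
def hazard_probability(sample_inputs, threshold=1.0, n=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = sample_inputs(rng, n)
    return float(np.mean(emulator(x) > threshold))

def scenario_uniform(rng, n):
    """Assumed scenario: all inputs equally likely within their ranges."""
    return LOW + (HIGH - LOW) * rng.random((n, 2))

def scenario_small_flows(rng, n):
    """Assumed scenario: small flows are much more common than large ones."""
    log_vol = np.clip(4.0 + rng.exponential(0.8, n), LOW[0], HIGH[0])
    friction = rng.uniform(LOW[1], HIGH[1], n)
    return np.column_stack([log_vol, friction])

p_uniform = hazard_probability(scenario_uniform)
p_small = hazard_probability(scenario_small_flows)
print(p_uniform, p_small)
```

The point of the design is that the 36 simulator runs are made once. Swapping in a different probability model for the inputs, as in the two scenarios here, only requires re-sampling the cheap emulator.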


Pitman taking questions from the group at the conclusion of his lecture on how statistics is being used to address potential hazards from volcanic activity.

By the way, the Volcano song by Jimmy Buffett was written in the late 1970s, about the Soufriere Hills Volcano. The volcano had been dormant since the 17th century, and erupted in 1996. Volcano was recorded at AIR Studios on Montserrat, a recording facility designed by Sir George Martin and opened in 1979. AIR Studios were destroyed by Hurricane Hugo in 1989.

SAMSI Postdoctoral Research on Ocean leads to Academic Career in Statistics


Mikael Kuusela, former Postdoctoral Researcher at SAMSI. He is currently a tenure-track assistant professor in the Department of Statistics and Data Science at Carnegie Mellon University.

Life has been good to Mikael Kuusela. A native of Helsinki, Finland, his has been a long journey to becoming a scientific researcher.

His curiosity and passion for new things, new people and new environments led him to pursue Bachelor’s and Master’s Degrees in Engineering Physics and Mathematics from Aalto University in Finland, and then a Ph.D. in statistics from École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland.

Kuusela is one of a long line of postdoctoral researchers who have attended SAMSI. Like him, others have found here a direction toward their dream of one day working in applied mathematics, statistics or one of the many other disciplines of data science.

“Mikael’s performance was superb during his time at SAMSI, and he has contributed to the active and lively research atmosphere at SAMSI by his participation at the seminars, workshops, working groups and other research and outreach activities,” said Elvan Ceyhan, Deputy Director of SAMSI. “He was not just focused in his research but also striving for SAMSI to be a center of activity and interaction.”

Kuusela came to SAMSI through a joint, climate-based program run by the Statistical Methods for Atmospheric and Ocean Sciences (STATMOS) research network and SAMSI. After spending one year at the University of Chicago as part of his postdoctoral experience, he arrived at SAMSI in 2017. Kuusela continued research he began at the University of Chicago, which focused on improving statistical methods for analyzing Argo float data.


“The main reason I chose this position was that it provided an opportunity to work on an extremely interesting application: the measurement of global ocean temperatures and salinities using data from Argo floats,” said Kuusela.

Argo floats measure temperature and salinity in the upper 2000 meters of the global ocean. By finding better ways to analyze these data, Kuusela and his colleagues hoped to show the prevalent evidence of environmental change in our world. This research required statisticians and oceanographers to work together in order to identify the best statistical techniques for studying these environmental changes in the ocean.

In support of the SAMSI Program on Mathematical and Statistical Methods for Climate and the Earth System (CLIM), Kuusela helped set up and coordinate the Statistical Oceanography Working Group, comprised of some of the world’s leading oceanographers and statisticians. One of their aims was to study and estimate the amount of heat in the global ocean and develop time series models to graphically depict where these trends occur and thus, where efforts need to be focused in order to mitigate environmental change.


Mikael Kuusela gives a talk during the Postdoctoral Fellow 2018 Spring Session. Kuusela was a second-year postdoc researcher at SAMSI, using his time to research better ways to analyze data involving ocean temperature variables.

“We are especially interested in the rate at which the global ocean is warming up and in the month-to-month variations in the heat content. Accurate estimation of these quantities is incredibly important since almost all of the heat that is trapped in the Earth’s Climate System will eventually end up in the ocean where it shows up as increased ocean heat content,” said Kuusela.

Kuusela says each day at SAMSI was new and exciting and often led him to collaborate with colleagues all over the world, sometimes in a single day!

“Some days are full of meetings. Sometimes I’ve had morning meetings with researchers in Europe, lunch meetings with people on the East Coast and afternoon meetings with collaborators on the West Coast. Since SAMSI work is almost by definition collaborative, it tends to involve a lot of travel. Sometimes I feel like I’ve spent as much time on the road as I did at SAMSI,” he said.

In all, Kuusela pointed to the numerous contacts he developed while working at SAMSI as among the most valuable outcomes of his time there. “Thanks to all of these interactions, you not only get to work on interesting interdisciplinary research projects, but also end up with a valuable network of professional connections. The possibility of networking and collaborating with researchers from various institutes across the country is indeed one of the great benefits of being a SAMSI postdoc,” he said.

Kuusela noted that, during his interview process for academic positions, he was comforted to run into numerous former SAMSI postdocs who are now faculty members at some of the nation’s leading institutions.

“A SAMSI postdoc can expect plenty of opportunities for engaging and collaborating with leading researchers in their field and beyond. In many ways, SAMSI is ‘the CERN of Statistics,’ a central hub of activity that enables researchers to interact in ways that would not otherwise be possible,” said Kuusela. “SAMSI enables postdocs to form a nationwide network of professional connections and to gain experience on cutting-edge interdisciplinary research.”

He also talked about what future SAMSI postdocs stand to gain from the postdoctoral research experience.

“SAMSI postdoctoral experience is clearly highly valued in today’s academic job market and it is easy to see why: SAMSI postdocs gain experience in working on complex applied and computational problems in statistics and mathematics in a collaborative and interdisciplinary environment.”


Mikael Kuusela leads a tutorial on R for a SAMSI-sponsored undergraduate workshop. During Kuusela’s second year as a postdoc, he routinely gave talks and participated in several workshops, where he had the opportunity to collaborate with and mentor undergraduate and graduate students and also gain valuable contacts for future collaborations with fellow researchers in his field.

Kuusela has now joined the Department of Statistics and Data Science at Carnegie Mellon University in Pittsburgh, PA, as a tenure-track assistant professor. While teaching, he will also continue to focus on his research on statistical data analysis in the physical sciences.

SAMSI is one of eight mathematics and science institutes, funded by the National Science Foundation, whose aim is to inspire, enhance and prepare the next generation of applied mathematicians, statisticians, and computer and data scientists for the future.

Mikael Kuusela is now enjoying the fruits of his labor and getting to do what he is most passionate about – helping to develop better data analysis methods to solve scientific problems that impact our planet and society.

To see postdoctoral research opportunities at SAMSI, visit: https://www.samsi.info/postdoc-jobs.

Reflections on SAMSI’s 2018 Undergraduate Modeling Workshop

Alex Hayes

Alex Hayes, Statistics Major, Rice University

The following is an extract from the blog of Alex Hayes. To view the entire piece, visit his blog.

I spent the last week at the Statistical and Applied Mathematical Sciences Institute’s (SAMSI) undergraduate modeling workshop. This year the workshop was hosted at North Carolina State University (NCSU) in Raleigh, NC.

The Rundown…

 About thirty students attended the workshop. To get in there’s a mellow application process. SAMSI covered travel, rooming and food for the participants. We were expected to bring laptops with R and RStudio installed. The purpose of the workshop was to give undergrads experience modelling real world data. Each year the workshop has a different theme, in our case statistical analysis of climate phenomena.

Before the workshop, we chose from a list of six projects for the week. On Sunday night, we flew in for a welcome dinner and met the other students on our project team. Each group had a SAMSI postdoc as group leader.

On Monday Doug Nychka and Chris Jones gave us a broad overview of the statistical issues present in climate science. We spent the afternoon doing some team building activities, discussing our interests, what skills we brought to our respective groups and developing research questions.

We spent the next three days working on our projects. We probably spent six hours a day modeling, and an hour or so at a research presentation or R workshop, and an hour goofing off and hanging out. The talks in particular were very good, presenting current research at the undergrad level in an engaging way.

In the evenings a small group would normally explore the bars in the NCSU area, which was nice after a long day on campus. The workshop concluded on Friday, when each group presented their findings before flying out in the afternoon.


The Workshop…

 My group was led by Mikael Kuusela, who did a fantastic job helping my group find research questions. He gave us a ton of individual feedback and was very attentive and patient. I particularly appreciated his advice on choosing questions that scientists care about.

Personally, the workshop enabled me to make some valuable connections within the stats community. At the end of the workshop, Mikael asked me if I’d like to write up a short outreach piece based on my project with him, which I’m super excited about. Keep an eye out for an upcoming piece on a functional decomposition of ocean thermoclines during El Niño (feat. a plot we’re calling The Bananafold).

Earlier this year Maggie Johnson, another SAMSI postdoc, put me in contact with some of the bioinformatics crew at Pacific Northwest National Laboratory and I nearly ended up taking a year off to work on omics projects with them.

I also had a blast getting to know Doug Nychka. Not only was Doug super patient with my many newbie questions about GAMs and splines, it was fun to chat with him about climbing and the UW-Madison statistics program.

 


Alex Hayes (middle) is joined by fellow Rice University Data Science Club members at the 2018 Undergraduate Modeling Workshop, May 21-25. Hayes influenced several of his fellow club members to apply for the worthwhile SAMSI workshop. (Photo courtesy of Alex Hayes)

Perspective on the Undergrad Stats Community…

As someone who’s spent a bunch of time organizing undergrad statistics activities over the last year, the workshop was an interesting opportunity to learn about the broader community of statistics undergraduates. Here are some of the notes I took.

We have fundamental misconceptions about the purpose of modeling: When groups presented their initial research questions, it was immediately clear that many students were conflating description, prediction and causation. Throughout the week, there were many attempts to turn everything into a prediction problem, or to interpret descriptive analyses as causal.

The pre-requisite stack is not very deep: Most students had taken a mathematical statistics course, but very few had much coursework beyond that. Fewer than half the workshop attendees had a background in linear regression, and people were much less comfortable with linear algebra than I would have expected. Barely anyone had a probability or analysis background.

Programming skills are rate-determining: We dramatically overestimate our R capabilities. In particular, non-tabular data really threw people off. My group took about three days to calculate mostly summary statistics and make basic plots.

Everybody’s resume looks the same: I’ll write more about this soon, but everybody advertises themselves in exactly the same way. This is despite having wildly varying skillsets. As a job seeker, how do you demonstrate that you are on the upper end of the competency spectrum? As a recruiter, how do you differentiate between candidates who look identical?

That’s a Wrap…

 We learned things at a great workshop. Everyone should go if they get a chance. The statistics community should spend more time teaching beginners about the big picture: what statistics is and how we should use it.

Rutgers Undergrad Challenged to Succeed at Modeling Workshop


Contributed by: Riya Prabhudesai, Math and Physics Major, Rutgers University

The SAMSI Undergraduate Modeling Workshop was an amazing learning experience that I will never forget. A series of collaborations with postdoctoral scholars and a diverse group of undergraduate students, lectures given by field experts, educational workshops, and a poster session made for a highly productive week for me.

Getting Started

The first few days were more intense and packed with educational material than I had initially expected. We received an in-depth yet broad overview of the field of climate science, as well as mathematical and statistical methods that help analyze the problems in climate change. This was accomplished through workshops that dealt with coding in R, as well as presentations given by the program’s coordinators and guest speakers.

I appreciated that despite being undergraduate students, our inexperience in the field did not translate to an inability or ineptitude in the mentors’ and coordinators’ eyes. This empowered us to take on challenges that seemed overwhelming and unattainable in the span of a week — I was never told that I did not have the ability or brain power to solve a problem. Not only were the postdocs and program coordinators extremely encouraging, but the undergraduates that I worked with and talked to were helpful in any way they could be, and came to the workshop with a desire to learn.

Being in this environment motivated me to delve deeper into the project I was working on with my group throughout the week, while also giving me a list of papers and topics in math that I wanted to learn more about when I got back home.


Rutgers University Undergraduate, Riya Prabhudesai, presents her research for her group project during SAMSI’s Undergraduate Modeling Workshop from May 21-25. Her group presented statistical research on vegetation through data captured by remote sensing methods.

Hard Work Pays Off

Although the workshop did prove to be intense and tiring at points, there was room for downtime with other students in the program. We ventured into the downtown Raleigh area a few nights and went on a few short walks around the NCSU campus itself. At the end of the week, we were able to exercise our presentation and communication skills through a 25-minute research presentation documenting the work we had done and problems we had solved throughout the week.

While the presentation in and of itself was a difficult task to finish, it served as a memento of all the work we had put in throughout the week. Through the course of a week, I learned how to implement various statistical and mathematical methods in R, and apply these techniques to analyzing complex climate systems. I caught a glimpse of the intricacy and thoroughness scientific research requires, and would recommend this workshop to anyone that has a vested interest in the mathematical sciences, as well as the subject material the workshop puts out every year.

Caltech, SAMSI Co-Sponsor Remote Sensing Workshop at JPL


Photo courtesy of NASA Jet Propulsion Laboratory

SAMSI directorate and researchers from around the world attended the Remote Sensing, Uncertainty Quantification and Theory of Data Systems Workshop, held at NASA’s Jet Propulsion Laboratory (JPL) in Pasadena, CA from Feb. 12-14, 2018.

This was a co-sponsored event in conjunction with researchers from Caltech and the purpose was to bring applied mathematicians, computer scientists, experts in remote sensing technology, and Climate and Earth System scientists together. The researchers reviewed, discussed and planned research on issues related to large-scale data analysis of distributed data using spatial statistical methods.

Attendees were treated to numerous talks on subjects like data system architectures, distributed access and analysis of large-scale data, and multi-layer modeling methods. Understanding these principles is vital in this research.

Remote sensing, used by the USGS and other research organizations, is the process of detecting and monitoring the physical characteristics of an area by measuring its reflected and emitted radiation at a distance from a targeted area. Data is captured from sources such as cameras on satellites and planes, sonar systems and many more. The data helps to determine physical changes in the Earth such as temperature variance and ocean levels. This information gives researchers a better understanding of how our Earth is changing and evolving.

This remote sensing information is then entered into spatial statistical algorithms to run estimates on present and future changes. This information in many cases can be quite massive and thus stored in different physical locations. The workshop helped to discuss how to store this important data and how best to share it with multiple entities across multiple platforms accurately and efficiently.

This productive three-day workshop produced much discussion and the participants went away with a better understanding of how to further this research.

SAMSI co-sponsors several workshops each year in order to develop key partnerships with academic institutions and leading edge research organizations, like NASA, to bring the most up to date information to those who participate in their events. This workshop was just one more way that SAMSI supports the advancement of statistics, applied mathematics and computer science in order to innovate the future.

To find out what was discussed at this workshop, visit the webpage: https://www.samsi.info/remote-sensing-caltech.


The Jet Propulsion Laboratory (JPL) is a federally funded research and development center and NASA field center, located near the California Institute of Technology (Caltech) campus. SAMSI, in conjunction with Caltech, co-sponsored the Remote Sensing, Uncertainty Quantification and Theory of Data Systems Workshop here Feb. 12-14, 2018.
Photo courtesy of NASA Jet Propulsion Laboratory

SAMSI Prepares Postdocs & Grads for Future Careers

Scott Morgan, a communications and marketing expert and owner of the Morgan Group, speaks at a recent SAMSI Professional Development Workshop (PDW). Morgan shared tips and tools with postdocs and graduate students on creating resumes and CVs and on interviewing for science and math jobs.


SAMSI is known for its workshops and special events promoting math and the sciences. You may not know, however, that it also works diligently to equip postdoctoral fellows and graduate students with tools for future success.

Since Sept. 2017, SAMSI has hosted multiple Professional Development Workshops (PDWs) geared toward helping new professionals find and launch careers in math and science.

The PDW series is hosted at the SAMSI institute and features a two-hour talk with question and answer session, followed by a lunch.

The series of lectures features members of the SAMSI Directorate; professionals from business, industry and government; and even specialists in the field of communications and marketing. All of these short talks help those interested in science and math careers to promote and catalog their research, create CVs and resumes, and gain practical skills such as interview preparation. There have been five events in the series thus far.

Huang Huang, a SAMSI 2017-18 Postdoctoral Fellow, appreciates the workshop series because it provides him with skills such as building a strong CV or resume, interview preparation, and time management. Huang thinks the tips he has learned thus far in the workshop series will help lead him to a future career in math and science.

SAMSI PDWs help grow and shape the next generation of young professionals in the fields of math and science. Those visiting or local to the SAMSI area who would like to attend future events can view the PDW web page for more information on current or past workshops: https://www.samsi.info/pdw.


Three members of governmental and industrial research organizations (left to right – William LeFew, a data scientist from Metabolon; Elizabeth Mannshard, a statistician from the U.S. EPA; and Fang Chen, a Senior Research Statistician from SAS) form a panel that answers questions from graduate students and postdoctoral fellows at a SAMSI Professional Development Workshop (PDW) on Nov. 15, 2017. The panelists discussed the paths they took to get jobs in their fields of statistics and data science. SAMSI PDWs help to guide graduate students and postdoctoral fellows to the next stage of their professional careers.

10 Minutes With: Hyungsuk Tak, SAMSI Postdoctoral Fellow

We recently took a moment to connect with one of our busy Postdoctoral Fellows, Hyungsuk Tak. We took ten minutes and asked him ten questions. This is what he had to say:


1. What made you decide to pursue a future in mathematics? Who has inspired you the most in your career thus far?

Tak: I have loved mathematics since my college days! But I did not like doing math for the sake of mathematics. I know that it is exciting for some people, but to me, it was boring. Instead, I wanted to use mathematics to solve real-world problems. In this sense, Statistics was PERFECT FOR ME. However, I had no intention to pursue a Ph.D. until I met Professor Carl N. Morris in the Harvard Statistics Department. He helped me experience statistical research and this motivated me to transfer into the Ph.D. program. During my Ph.D., I met Professors Xiao-Li Meng and David van Dyk who introduced Astro-Statistics to me. This later became my career path and it is why I am working in the ASTRO Program here at SAMSI.

2. Where did you grow up and where did you go to school?

Tak: I grew up in Seoul, South Korea. I also received my undergrad degree there. I then moved to the USA for my masters and Ph.D. in Statistics at Harvard.

 


 

3. When you aren’t tackling complex math equations or doing research, what do you enjoy doing in your spare time?

Tak: I mainly do three things in my spare time: 1) reading; 2) exercising; and 3) surfing the internet. For instance, in the morning I always read a book (written in Korean) while I have breakfast, for about 30 minutes – I like historical and classical novels more than contemporary ones. If I find the book really interesting, then I often read the book after I get home until I go to bed. I also exercise for about two hours on Tuesday, Thursday, and Saturday. When I was a child I suffered from tuberculosis, so health is the most important thing in my life. When I get home, I often spend most of my time surfing the internet and catching up with current events by reading Korean news articles. I really enjoy seeing how our new Korean president, Moon Jae-in, is doing. He is the person I voted for in the recent Korean Presidential Election.

4. What were some of the reasons you decided to apply for a SAMSI Postdoctoral fellowship?

Tak: There were three reasons: 1) SAMSI post-docs have a great amount of freedom in doing research because SAMSI allows post-docs to work with any professors or researchers at Duke, UNC, NCSU, and/or any other universities in the USA; 2) SAMSI is the place where domain scientists visit (physically or remotely) for collaborations, which means there are plenty of opportunities to learn new things; and 3) SAMSI post-doc fellows receive a generous salary for a post-doc position in Statistics.

5. What are some of the things that have intrigued you about the SAMSI program you are supporting this academic year?

Tak: I currently serve in the ASTRO program. The most intriguing thing is that I have been fortunate enough to have the opportunity to meet and work with astronomers who have brought interesting and realistic problems to SAMSI during the workshops or weekly meetings. All my current research, that was initiated after I came to SAMSI, is based on solving these realistic problems.

6. What program or workshops will you be supporting in the 2017-2018 academic year? Are you looking forward to any new research coming up?

Tak: I am continuing my research in Astro-Statistics rather than start new research in other fields unless there is a program closely related to my current research in terms of methodology.

7. How are you enjoying living and working in North Carolina?

Tak: When I landed at the RDU airport (from Boston), I saw, from the airplane, that N.C. is full of trees. Everything I saw through the window in the airplane was green with almost no buildings – I immediately loved this nature-friendly environment. I saw a fox (or coyote) and I have seen many deer around SAMSI; one day three deer were standing next to the entrance! I really enjoy N.C. for the nature-friendly lifestyle. I also enjoy sometimes hiking and walking trails.

8. When your time is over at SAMSI, what will you miss the most and why?

Tak: I will miss the people at SAMSI the most: post-docs, administrative officers, directors, graduate and faculty fellows, visitors, and custodians. Since I spend most of my time on the SAMSI campus, plus the fact that the institute is a little isolated (surrounded by woods), even a short and small interaction with people at SAMSI has been invaluable and memorable to me.

9. What are your plans for the future? Do you see yourself working in academics or business/industry and why?

Tak: I am going to apply for a tenure-track position at an academic institution in the US this winter. If it does not work out however, then I will start looking for industry jobs early next year. I may not do a second post-doc.

10. What advice and/or guidance would you give to other undergraduate/graduate students interested in working in mathematics?

Tak: I recommend reading as many books as possible. I personally believe that mathematical ability comes from the power of thinking, the power of thinking comes from imagining something, and imagining something comes from reading books. Again, this is not based on a causal inference but based on my personal belief (prior information that can be biased!).

Undergrad Workshop Helps Student See Bright Future in Applied Math and Statistics

My name is Victoria Sabo and I am a mathematics and Spanish double major at Georgetown University. I am very interested in applying math to problem solving in the real world, such as using programing and data in security, population modeling, analyzing businesses, or even tracking supermarket inventory to minimize product waste.

My research interests are why I applied to the SAMSI undergraduate workshop from May 14-19. The workshop gave me the opportunity to apply mathematics and computer science to realms usually isolated from the sciences. Based on the description, I imagined being exposed to new applications of math, stats, and computing while having the opportunity to harness my mathematics knowledge to solve an actual problem that I may not have known could be solved using the skills of a mathematics major. In the end, I gained ample skills, both academic and professional, and I was able to test them out while working on my own group research project.

David Jones, SAMSI Postdoctoral Fellow, presents information on the Light Curve Project to students at the Institute for Advanced Analytics on the campus of North Carolina State University. The instruction was part of SAMSI’s week-long Interdisciplinary Workshop for Undergraduate Students, May 14-19, 2017.

After dedicated postdocs presented the overviews of six projects, we were allowed to rank our top choices:

  1. Lightcurve Classification for Periodically Varying Stars (Light Curves Project)
  2. Distributionally Robust Stochastic Programming for Financial Applications (Finance)
  3. Finding Exoplanets Using Radial Velocity Data (Exoplanets)
  4. Automatic Genre Classification of Music Pieces (Music)
  5. Time Delay Estimation for Gravitationally Lensed Light Curves (Time Delay)
  6. Data Assimilation for Numerical Weather Prediction (NWP)

I was fortunate enough to receive my first choice, which was the Automatic Music Genre Classification project. That meant that for the entire week I would work on a team to investigate algorithms for supervised learning, in which training data taken from a music dataset are used to build a system for predicting the genres of unlabeled songs.

When we first met in groups, we discussed how to read the data and began thinking of probability techniques common to machine learning that would be useful for the task. We read scholarly articles about previous approaches to the problem, then met the following day to begin coding programs based on our dataset.

“I was pleasantly surprised at the diversity of the attendees at the workshop. The backgrounds of the students ranged from civil engineering, to a double major in math and piano…This variety in background facilitates the sharing and cross-pollination of ideas from different fields, which I deeply appreciated.” – Kevin Multani, Applied Science, Department of Engineering Physics, University of British Columbia – Vancouver, Canada

An undergraduate student presents the findings of her group’s project during SAMSI’s Interdisciplinary Workshop for Undergraduate Students held on the campus of North Carolina State University May 14-19, 2017.

As the week went by, we experimented with different combinations of song features, such as loudness, danceability, and song_hotttnesss (no, not a typo), and with various techniques, all aimed at achieving the highest accuracy in song genre classification. The techniques included k-means clustering, k-nearest neighbors, Gaussian classifiers, PCA, and t-SNE. Through this process, it was very interesting to see the limitations on our research and how factors such as the quality of the data set and the time constraint affected what we could accomplish. Overall, this research project introduced me to what it was like to work on a team to conduct formal research. I also enjoyed spending the week bouncing ideas off my group members as we worked to solve a problem found at the intersection of two distinct subjects: math and music.
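For readers curious what one of these techniques looks like in code, here is a minimal sketch of k-nearest-neighbor classification. The feature values, the genre labels, and the function name `knn_predict` are made up for illustration; this is not the data or the code from the workshop project.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort training songs by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest songs and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (loudness, danceability) pairs, already scaled to [0, 1].
training_songs = [
    ((0.90, 0.30), "rock"),
    ((0.85, 0.25), "rock"),
    ((0.80, 0.35), "rock"),
    ((0.40, 0.90), "dance"),
    ((0.35, 0.85), "dance"),
    ((0.45, 0.80), "dance"),
]

# Classify an unlabeled song that sits close to the "rock" examples.
predicted = knn_predict(training_songs, (0.82, 0.28), k=3)
print(predicted)
```

In practice the choice of features, their scaling, and the value of k all affect accuracy, which is why experimenting with different feature combinations matters.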

Besides the experience of working in a research group, I created lifelong memories at this workshop thanks to the incredibly intelligent people I had the pleasure of meeting. I was introduced to undergraduates from across the United States and Canada, many of whom had international backgrounds as well. Everyone brought a unique computer programming skill set from their own university. The diverse backgrounds of the participants, and the variety of courses they had taken, contributed to the success of the research project. I loved hearing about everyone’s majors and their career goals. I found it invaluable to be able to exchange advice with people who were as interested in the sciences as I am. During meals and breaks, we would discuss our graduate study goals as well as the research we had conducted so far. I was given advice on which conferences to attend and which schools were best for certain master’s degree or Ph.D. programs. I have definitely reevaluated my future plans since conversing with and listening to such a wide range of science and math students.

One of my peers in the workshop, Kevin Multani, an undergraduate student from the University of British Columbia – Vancouver, Canada, had similar points to share:
I was pleasantly surprised at the diversity of the attendees at the workshop. The backgrounds of the students ranged from civil engineering, to a double major in math and piano — there was even a student who was double majoring in Philosophy and History (of Mathematics)! This variety in background facilitates the sharing and cross-pollination of ideas from different fields, which I deeply appreciated. Most of my learning came from discussion and conversation with the students and mentors. In fact, through conversation with my mentor, David Jones, I’ve gained a solid understanding on what to expect for graduate school. Overall, the SAMSI Undergraduate Workshop was a refreshing experience, both personally and academically.

Even though the friends that I made during the week were an enriching part of this SAMSI undergraduate workshop experience, the panels and talks organized for us also made an impact on me academically. We received information on North Carolina State University’s Master of Science in Analytics program, since the Institute for Advanced Analytics, where the event was hosted, is located on the university’s campus. I came out of this workshop with a broader understanding of the great career opportunities in data analytics. Thanks to the talk from Michael Rappa on opportunities in data analytics and his program within the institute, my eyes were opened to how many different applications of data analytics there are for people with those skills. For instance, I had never considered that someone with a math background was needed to calculate the appropriate amount of supermarket inventory to prevent overstocking and understocking. Likewise, I did not know that companies hired analysts to evaluate their businesses in order to maximize the efficiency of their hiring efforts. Because of my interest in applying math to real-world problems, I am now going to focus my efforts on exploring this area as a possible career path. I am also looking forward to strengthening my computer programming skills, because I now recognize that coding and programming, in addition to a solid background in linear algebra and classical mathematics, are essential skills for the type of work in which I am interested.

A group of students prepares for their project presentation during SAMSI’s Interdisciplinary Workshop for Undergraduate Students, May 14-19, 2017. The workshop required students to work in multiple groups and present findings on assigned subjects.

I entered this SAMSI workshop as a mathematics major, but I lacked the knowledge of how I could put that degree to good use by applying math to real-world problems. After the workshop, however, I have conducted research in an application of math to music, something I never imagined was possible!

I was also introduced to countless other opportunities available for individuals trained in math, computer science, and analysis techniques. I feel that by taking more courses geared towards applications of math in the real world, I can better prepare myself to succeed in a career in data analytics. Additionally, I am now informed about what it takes to create a successful application to graduate school and which programs I should consider that will best prepare me for a productive and fulfilling future.

This undergraduate research workshop not only provided me with research, public speaking, and teamwork experience, but also educated me on what options exist for my future. Although I have much more to think about, SAMSI was a starting point in helping me determine where I would like to see myself in the coming years and in working out the best way to use my mathematics and computing knowledge to benefit others in the future.

Undergraduate students from across the nation pose for a group shot during SAMSI’s Interdisciplinary Workshop for Undergraduate Students, May 14-19, 2017.

SAMSI Brings Astronomers and Statisticians Together to Study Universe

Contributed by:

Jim Barrett,
Ph.D. student, School of Physics & Astronomy, University of Birmingham, UK
Maya Fishbach,
Ph.D. student, Department of Astronomy and Astrophysics, University of Chicago, USA
Bo Ning,
Ph.D. candidate, Department of Statistics, North Carolina State University, USA
Daniel Wysocki,
Ph.D. student, School of Physics & Astronomy’s Astrophysical Sciences & Technology program, Rochester Institute of Technology, USA

The four of us are graduate students who have come together from different universities and a variety of disciplines: Jim Barrett studies astrophysics in the University of Birmingham’s School of Physics & Astronomy, Maya Fishbach studies astrophysics at the University of Chicago, Bo Ning studies statistics at North Carolina State University, and Daniel Wysocki studies astronomy at Rochester Institute of Technology. We came to know each other by attending the Astrophysical Population Emulation and Uncertainty Quantification workshop held at SAMSI. This workshop was one of a series of workshops in the year-long program on Statistical, Mathematical and Computational Methods for Astronomy (ASTRO), and we enjoyed the experience we had there.

This was a very hands-on workshop, which gave all of us wonderful opportunities to sit down and have face-to-face discussions with fellow researchers from a variety of universities. Because we come from different disciplines (three of us in astrophysics and astronomy and one in statistics) and different universities (including one overseas), we would not have been able to meet and collaborate had we not come to SAMSI for this workshop. So, thanks to SAMSI and the National Science Foundation (NSF) for supporting us and for giving us this exciting opportunity to meet and collaborate with so many eminent researchers in astrophysics, mathematics and statistics. The connections we made may well last throughout our careers.

Background
As we share our experiences, please allow us to first explain the context of this workshop. The workshop lasted for one week, from April 3-7, 2017. Its theme was the use of fast emulators to generate population models in various fields of astrophysics, including exoplanets, gravitational waves and extragalactic astronomy. Ilya Mandel, a professor at the University of Birmingham, and Derek Bingham, a professor at Simon Fraser University, organized this workshop.

During the first day of the workshop, researchers gave short presentations about their current projects in the morning, and four working groups were formed in the afternoon. The first group’s purpose was to discuss “the population emulation of massive binary stars,” and “the population of exoplanets.” The second group focused on “estimating statistical density functions for the population of gravitational wave sources.” The third and fourth groups focused on “Gaussian process (GP) model inference.” One of these groups focused on setting up GP models and coding them in Python notebooks, while the other focused on building a GP emulator into a hierarchical Bayesian model.

After the working groups were formed, the group members spent the majority of their time discussing new ideas and working on preliminary results throughout the remaining days of the workshop. Besides these group discussions, three tutorial lectures were given by Derek Bingham and Earl Lawrence on the second and third days of the workshop. These tutorials introduced computer model emulators, especially the GP model, discussed model calibration, and gave an overview of how to choose different strategies for the design of computer experiments.
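To give a flavor of what a GP emulator does, here is a minimal sketch, not the code used at the workshop. It fits the posterior mean of a zero-mean Gaussian process with a squared-exponential kernel to a handful of runs of a stand-in "simulator" and then predicts the output at a new input without running the simulator again. The simulator (`slow_simulator`), the design points, the fixed length scale, and all function names are assumptions chosen for illustration.

```python
import math

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is not modified.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_emulator(xs, ys, jitter=1e-8):
    """Precompute the weights K^{-1} y used by the GP posterior mean."""
    gram = [[rbf_kernel(a, b) + (jitter if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return solve(gram, ys)

def emulate(x_new, xs, weights):
    """GP posterior mean at x_new: k(x_new, X) @ K^{-1} y."""
    return sum(rbf_kernel(x_new, x) * w for x, w in zip(xs, weights))

def slow_simulator(x):
    """Stand-in for an expensive computer model (hypothetical)."""
    return math.sin(x)

# A small "design" of inputs at which the simulator is actually run.
design_points = [0.0, 1.0, 2.0, 3.0, 4.0]
runs = [slow_simulator(x) for x in design_points]
weights = fit_emulator(design_points, runs)

# Predict at an input that was never simulated and compare with the truth.
approx = emulate(2.5, design_points, weights)
exact = slow_simulator(2.5)
print(approx, exact)
```

A full emulator would also report predictive uncertainty and estimate the kernel parameters from the runs, and where to place the design points is the experiment design question covered in the tutorials.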

During the workshop, the four of us joined different groups and had different experiences. In the rest of this blog post, we would like to share our individual experiences and takeaways from this workshop.

“I have been interested in astronomy since I was young, but I never dreamed about the day that I would be able to work side-by-side with astronomers, using statistics to solve their problems.” – Bo Ning

Group Analysis

Jim Barrett:

Working Group I

I came to the workshop with my supervisor Ilya Mandel from the University of Birmingham in the UK. We work on modelling the evolution of binary stars, and in particular the kind of systems that could potentially become gravitational wave sources. We are actively developing a rapid population synthesis code, which simulates the entire lifetime of a binary star in a fraction of a second. This allows us to generate vast populations of binaries, so that we can use statistics to study the population as a whole.

In particular, we are interested in how we can use gravitational wave observations to challenge the assumptions we make in our simulations. However, this is highly challenging: because gravitational wave systems are so difficult to make, we typically need to simulate tens of thousands of systems to get just one gravitational wave source. We therefore came to the SAMSI workshop to get help and advice on building an emulator for our model.

Ilya Mandel, a professor of Theoretical Astrophysics from the University of Birmingham (UK) discusses data capturing methods during the Astrophysical Population Emulation and Uncertainty Quantification Workshop at SAMSI on April 3-7, 2017.

We spent the week engaged in many stimulating and fruitful discussions with the statistics experts and fellow astrophysicists. We discussed the best approaches to building an emulator and spent many hours talking about experiment design. We left the workshop with a solid plan for how to proceed with our emulation problem and eager to continue to collaborate with the workshop participants in the future.


Maya Fishbach:

Working Group IV

After attending the ASTRO opening workshop in August, I was excited to return to SAMSI. My research interest is to learn about populations of black holes from analyzing gravitational wave data, so I had joined Working Group 4 at the opening workshop. At one of the working group’s weekly telecons, Sujit Ghosh, SAMSI Deputy Director, presented his research with Angie Wolfgang and Bo Ning into using Bernstein polynomials to estimate the joint mass-radius density for a population of exoplanets. After email discussions with Sujit, Angie and Bo, I was inspired to further explore statistical methods for density estimation that could also be applied to populations of black holes. Thus, while my goal for this workshop was to explore density estimation techniques, I knew that I would encounter new ideas along the way that would inspire new and unanticipated projects.

For example, in initial discussions with statisticians Sujit, Bo and Ji Meng Loh, the problem of selection effects kept coming up. In the case of gravitational waves, massive compact binaries are louder than less massive ones, and so we are more likely to detect them. Therefore, when inferring the mass distribution over the population of compact binaries, it is critical to account for this selection effect that favors massive binaries. Fortunately, Tom Loredo, Ilya and Daniel Wysocki had previously thought a lot about how to incorporate selection effects when analyzing populations of astronomical objects. This led to a large fraction of the astronomers and statisticians spending an afternoon listening to and discussing their results. Open problems remained in the case where the selection effect was not known precisely. For example, Leslie Rogers thought about how to define the selection probability for exoplanet mass and radius measurements, which is difficult because the mass and radius are measured by different surveys. In addition, Kaisey Mandel was working on defining selection effects for supernova surveys. The topic of selection effects is fundamental in astrostatistics, and it was very useful to discuss known methods of incorporating selection effects in a population-level analysis as well as challenges that remain.
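A toy simulation can make the selection effect concrete. The sketch below is an illustration under made-up assumptions, not the method discussed at the workshop: the mass distribution, the `detection_probability` function, and the simple inverse-probability weighting used to correct the bias are all chosen for demonstration, and the detection probability is assumed to be known exactly.

```python
import random

def detection_probability(mass, max_mass=50.0):
    """Hypothetical selection function: heavier systems are easier to detect."""
    return min(1.0, (mass / max_mass) ** 2)

rng = random.Random(0)

# Hypothetical underlying population: masses uniform between 5 and 50 (arbitrary units).
population = [rng.uniform(5.0, 50.0) for _ in range(100_000)]

# Keep each system with probability equal to its detection probability.
detected = [m for m in population if rng.random() < detection_probability(m)]

true_mean = sum(population) / len(population)

# The naive average of detected systems is biased toward high masses.
naive_mean = sum(detected) / len(detected)

# Weight each detection by 1 / p_det so rarely detected light systems count for more.
weights = [1.0 / detection_probability(m) for m in detected]
corrected_mean = sum(w * m for w, m in zip(weights, detected)) / sum(weights)

print(true_mean, naive_mean, corrected_mean)
```

The weighted estimate lands close to the population mean while the naive one overshoots it. When the selection probability is itself uncertain, as in the open problems mentioned above, this simple reweighting no longer applies directly.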


Bo Ning:

I have been an active participant in the ASTRO program at SAMSI since the opening workshop in August 2016. Since then, Sujit Ghosh, Angie Wolfgang, and I have been working on a project that uses a nonparametric method to estimate the mass-radius relationship of exoplanets. Previous studies focused on making parametric assumptions based on the power-law model. However, these assumptions are somewhat arbitrary and often fail to hold. As a result, we are using a more flexible model to estimate the mass-radius relationship of exoplanets.

This workshop provided Sujit, Angie and me with the opportunity to meet and to have face-to-face discussions on the details of model inference. After the workshop ended, Angie and I spent an extra week working on our project. We made a great deal of progress over those two weeks. For example, we finished the outline of the paper draft and obtained some preliminary results. We also sorted out our future plans and possible cooperation after the end of this workshop.

During the workshop, I also had discussions with Maya Fishbach and Daniel Wysocki about their project on gravitational waves. Even though this topic is quite different from modeling the mass-radius relationship of exoplanets, the nonparametric model that Sujit, Angie and I used for exoplanets also turned out to be useful for some of their problems, which was very exciting.

I would like to thank SAMSI for providing a great opportunity for interdisciplinary cooperation. I have been interested in astronomy since I was young, but I never dreamed about the day that I would be able to work side-by-side with astronomers, using statistics to solve their problems. Through my participation in this program, and by attending this workshop, I learned a lot about how to apply statistical models to solve problems in astronomy.


Daniel Wysocki:

After the incredible learning experience I had during the ASTRO opening workshop last August, I was pleasantly surprised that the Astrophysical Population Emulation and Uncertainty Quantification workshop surpassed it. As a second-year astrophysics Ph.D. student, I am working on methods to constrain the properties and origins of the population of compact binary objects responsible for the gravitational waves observed by the Laser Interferometer Gravitational-Wave Observatory (LIGO). By working with astronomers and statisticians who study problems from different domains but face similar statistical challenges, I gained a much deeper understanding of the fundamental concepts and problems underlying the statistics relevant to my research.

One subject I gained a great deal of insight into was dealing with selection effects. Since I work with gravitational wave observations of individual binaries, all of my inferences on the population have to account for the fact that we’re more likely to detect massive objects due to the resulting increased signal strength, as well as a number of other biases. I came to appreciate how easy I have it after Eric Ford described a problem that depended on the number and types of planets orbiting each star; incredibly challenging considering we may never see some of those planets. Many questions I had on selection effects going into the workshop were cleared up, and I even discovered an error in an essential equation in a paper I’ve been writing.

In addition to the effective mix of people, I also thought the number of people attending the workshop hit a sweet spot. There were enough people to keep good diversity in skill sets, but the group was also small enough that I got to meet the majority of the attendees, which is a hard balance to strike.

A new collaboration was started as a result of this conference, between Dr. Sujit Ghosh, Bo Ning, Maya Fishbach, and myself, which will come to fruition over the coming months. In describing the related astrophysical problems Maya and I are working on, Sujit came up with an alternative approach utilizing copulas, which I was unaware of beforehand. We will be working on a paper where we apply this type of method, and compare its performance with the approaches we’ve taken in the past.

As the ASTRO program comes to a close, I’m sad to see it go. Through the opening workshop and the many SAMSI teleconferences I have attended throughout the year, I have learned a great deal about the general field of astrostatistics, and I now understand the major statistical challenges being faced across the many branches of astronomy. I hope to find myself back at SAMSI for similar programs in the future.

Drexel PhD Candidate Gains Perspective on Big Data in Astronomy at International Workshop

Contributed by: Jackeline Moreno, Physics Ph.D. Candidate, Drexel University

I am a fourth-year graduate student at Drexel University. My research area is optical AGN variability and accretion physics.  However, attending workshops like this one and participating in a SAMSI ASTRO working group have expanded my interest to other types of variable objects and time series signatures.  I enjoy thinking critically about how these characterizations relate to physical properties of objects grouped in the same hyperplane of parameter space.

Our community of astronomers, statisticians and physical scientists is excitedly anticipating the era of time domain astronomy and gravitational wave detection, our new lens for probing the distant universe.  The SAMSI-ICTS workshop (Time Series Analysis for Synoptic Surveys and Gravitational Wave Astronomy) made a pioneering effort to bring together experts from seemingly different research fields in order to find common ground to exchange techniques and insights for analyzing time series data.  The workshop was hosted by the International Centre for Theoretical Sciences (ICTS) in Bengaluru, India. ICTS and SAMSI worked together to arrange speakers to present interesting content, coordinate meals, handle logistics for the workshop and manage transportation for outings to explore the city. Special thanks are owed to ICTS, which went above and beyond in assisting with visas, travel and accommodations and in orchestrating the 4-day workshop.

James Long, Asst. Professor of Statistics from Texas A&M University, gives a talk during the Time Series Analysis for Synoptic Surveys and Gravitational Wave Astronomy workshop in Bengaluru, India. The four-day workshop was held at the International Centre for Theoretical Sciences (ICTS) and was co-sponsored by SAMSI.

The speakers presented on various topics, such as variability statistics for classification in surveys, domain adaptation, noise modelling, and a whole slew of methodologies used to study the physics of transients, periodic and aperiodic variables, and binary candidates for GW detection and localization. Speakers emphasized critical issues that needed improvement or further investigation. These issues were framed as challenges to facilitate possible projects for collaboration. Talks were followed by panel discussions.  Several participants suggested that similar future workshops should allot time for hacking or coding in conjunction with the panel discussions.  There was also an effort to document the challenges in an Authorea document, to serve as a discussion board afterward.

“SAMSI workshops and working groups have helped me understand how my thesis work fits into the larger scientific picture and how to gain a better understanding of what our science priorities are as a community of observational astronomers.”

All of the talks were video recorded, so visitors can view the talks along with the participant list and the abstracts of the presentations. Photos and links to the SAMSI webpage are also provided. SAMSI was a proud co-sponsor of this event and looks forward to supporting research events like this in an international community setting in the future. The sessions between panel discussions were organized into the following broad topics:

  1. Outliers and Background
  2. EM follow up of GW events
  3. Science of Transients, and
  4. Techniques for Time Domain Astronomy

A few talks stood out to me. Rafael Martinez (Associate Scientist at the Harvard-Smithsonian Center for Astrophysics) spoke on “Building a Training Set for an Automatic LSST Lightcurve Classifier.” He talked about combining different classifiers, the problems with miscellaneous labels containing the largest number of objects, and problems with period-finding algorithms.  Hyungsuk Tak, a SAMSI postdoc, also gave a very nice talk, “Robust and accurate inference via a mixture of Gaussian and t errors,” in which he asked why astronomers so often and automatically assume Gaussian-distributed errors. He presented a very promising method he developed that combines Gaussian and heavy-tailed (t-distributed) error models, and demonstrated that the accuracy of inferred parameters improved significantly.  Another talk I enjoyed was “Gamma Ray Bursts and Associated Supernovae” by Kuntal Misra (Scientist at the Aryabhatta Research Institute of Observational Sciences [ARIES] in Nainital, India).  She provided a comprehensive discussion of the lightcurve and spectral features used to classify and characterize these objects.
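To illustrate why the Gaussian error assumption questioned in Hyungsuk Tak’s talk can matter, here is a small sketch of my own. It is not the mixture method he presented; it simply swaps a Gaussian error model for a plain Student t error model with a fixed scale and 4 degrees of freedom, and compares the resulting location estimates on made-up measurements that contain one outlier.

```python
import math

def gaussian_loglike(mu, data, sigma=1.0):
    """Gaussian log-likelihood of location mu, up to a constant that does not depend on mu."""
    return sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)

def student_t_loglike(mu, data, sigma=1.0, dof=4.0):
    """Student t log-likelihood of location mu, up to a constant that does not depend on mu."""
    return sum(-0.5 * (dof + 1.0) * math.log1p(((x - mu) / sigma) ** 2 / dof)
               for x in data)

# Hypothetical measurements of one quantity; the last value is an outlier.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]

# Maximize each likelihood over a grid of candidate locations from 0 to 30.
grid = [i / 100 for i in range(0, 3001)]
gaussian_estimate = max(grid, key=lambda mu: gaussian_loglike(mu, measurements))
t_estimate = max(grid, key=lambda mu: student_t_loglike(mu, measurements))

print(gaussian_estimate, t_estimate)
```

The Gaussian estimate is the sample mean, which the single outlier drags well away from the bulk of the data, while the heavy-tailed model stays close to the five consistent measurements.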

Participants of the Time Series Analysis for Synoptic Surveys and Gravitational Wave Astronomy workshop pose for a group shot at the International Centre for Theoretical Sciences (ICTS) in Bengaluru, India. The group was composed of astronomers, astrophysicists and statisticians from all over the world.

From my perspective, as a fourth-year graduate student, I found the SAMSI workshops to be very eye-opening because they gave me so much context about sophisticated and efficient methodologies that work well with different data sets.  They provided a briefing on the latest and greatest techniques being applied to astronomical data in a setting conducive to discussion, cross-discipline education, and collaboration.  SAMSI workshops and working groups have helped me to understand how my thesis work fits into the larger scientific picture and to gain a better understanding of what our science priorities are as a community of observational astronomers.

I’m excited to see where these applications of machine learning take us.  In the future, I’d like to see more applications of hierarchical clustering and other techniques that capture continuity between subpopulations within a broader class.  These methods might help us transition into this massive (time series) data era and better understand our observations, both as dynamic systems and in an evolutionary context.

This conference was not only great because of the science and stats. The location and the people who attended made it an unforgettable experience for me! Both ICTS locals and people invited through SAMSI were genuinely welcoming and kind folks. In the evenings after the workshop we all had dinner together, went for bike rides and played some ping pong.  After the workshop, I was invited to join a group touring the central part of Bengaluru and the archaeological sites at Hampi.  The days that followed were an adventure, and I sincerely appreciated the moments I shared with the great friends I made through this workshop!

Participants take a break from the Time Series Analysis for Synoptic Surveys and Gravitational Wave Astronomy workshop to explore Bengaluru, India. The four-day workshop was held at International Centre for Theoretical Sciences (ICTS) and featured speakers in the field of astronomy from around the world.