What Does It Mean to be a Woman in Mathematics?

The following blog post was written by Jessica Matthews, Cooperative Institute for Climate and Satellites (CICS-NC).

room full of women watching presentation

Workshop for Women in Mathematics held April 6-8, 2016.

When offered the invitation to speak at SAMSI’s Opportunities Workshop for Women in Math Sciences, I gladly accepted. When it came time to actually prepare the presentation, I realized that I had never attended, let alone presented at, this type of workshop before. I am well versed in putting together a scientific presentation, but this was different. So I myself was faced with the opportunity to consider what it meant to be a woman in mathematics. I had the opening talk time slot, which inherently carries with it the pressure of setting the tone for the entire event. I chose to draw from my personal experiences and to discuss career possibilities beyond the classroom, skill sets I have found necessary (beyond math), and a few key challenges faced by women in our field. A spirited discussion regarding the pay gap and the importance of negotiation ensued. I enjoyed the free-flowing discussion, and felt that this open and welcoming atmosphere was present for the rest of our time together.

Throughout the two and a half days of the workshop, we had the privilege of hearing from a number of women who have successful careers in academia, industry, and government. They shared their lessons learned, fielded questions, and led discussions about career opportunities and challenges experienced. I cannot possibly capture a comprehensive account of all the great talks and conversations that took place in this workshop, so I provide merely a few personal highlights.

two ladies talking in the hallway

Amanda Golbeck (R) talking to a participant of the workshop.

Amanda Golbeck introduced the concept of viewing one’s career path as a jungle-gym rather than a ladder. We tend to have the ingrained view of the traditional (and linear) career path, while in reality, to maintain a healthy life–work balance, flexibility is required.  Another grain of wisdom she offered is that being a strong leader is important, but being a valuable team member is paramount. I think this is often forgotten in our power-hungry society, but the truth is that more can be accomplished via cooperation and we should value the cultivation of teamwork skills.

Panel at the women in math workshop

L-R: Ulrica Wilson, Lea Jenkins and Amanda Golbeck.

Drawing on her experiences at a historically black university, Ulrica Wilson offered a great explanation as to why having workshops such as this one is not only relevant, but important for increasing and maintaining diversity. When we take the time to create this space, we are able to stop focusing on what makes us different and just focus on the math—which is really what we were all drawn to when we chose this pursuit in the first place!

Marie Davidian gave a fascinating overview of notable women in the mathematical sciences, both in the past and the present. I was captivated by the story of the trailblazer Gertrude Cox, founding head of the (then-named) Department of Experimental Statistics at NCSU in 1941. Her recommendation for the position came by way of a footnote appended to a letter containing a list of recommended male peers: “Of course if you would consider a woman for this position, I would recommend Gertrude Cox of my staff.” This truly puts into perspective how far the community has come with regard to gender equality.

The workshop attendees were energetic and engaged, which made the panel-led discussions and breakout sessions (not to mention breaks) both stimulating and fun. The participants were largely graduate students and early career scientists, who had plenty of thoughtful questions for the expert representatives from academia, industry, and government. Even though I may have been cast as one of the experts, I found that I learned a lot and left the workshop with a to-do list of actions I am interested in taking. In particular: joining a mentor network, engaging more in professional society events, and advocating for family leave benefits.

I am glad to have had this opportunity to consider the challenges, and solutions to those challenges, faced by women and minorities in the mathematical sciences. I’d like to thank SAMSI for hosting this event and allowing us to gather and reflect on both the progress that has been made, and the issues that remain. It is only through this type of directed intention that we may continue to move towards equality.

Taking a Different Road – Being a Statistics Major

The following was written by Sarah Lotspeich, University of Florida, who attended the SAMSI Undergraduate Workshop focusing on Computational Neuroscience.

I declared my Statistics major in the eleventh grade, approximately halfway through my AP Statistics course. As everyone around me pondered medical school and the many types of engineering, I knew that my choice seemed unconventional. Now three years into my undergraduate degree, I have met only a handful of fellow Statistics majors to date. During the third week of October, however, this changed forever as I attended the SAMSI Undergraduate Workshop.

Duke Chapel

Duke Chapel.

It was a gorgeous fall day (a pleasant surprise for me, as my typical “fall” in Gainesville, Florida includes a few fallen leaves and a high temperature in the 80s) in Research Triangle Park, North Carolina. Budding statistics and mathematics students from across the country gathered to explore computational neuroscience, and to enjoy fantastic food. Always eager for an adventure, I flew in as early as possible the day before the workshop to get maximum exploring time in Durham. Perhaps a bit TOO eager, I walked over eight miles through Downtown Durham and to both edges of Duke University’s gorgeous gothic campus.

Dame's Chicken and Waffles

Excellent chicken and waffles place!

Fret not, however, as I was well fueled by Dame’s Chicken and Waffles and fondue from the Little Dipper. Needless to say the local area surpassed my every expectation and left me excited to wear scarves and learn more about statistics the following day. The mingling began at approximately 7:30am the next morning, as over thirty of my fellow “numbers people” bonded over bagels and oatmeal. I was so excited to hear from people who care as much about significance tests and p-values as I do!

The presentations commenced with an absolute bang as Dr. Ciprian Crainiceanu of Johns Hopkins University immersed us in “Neurohacking”. He outlined the basic principles of converting MRI images from picture to a system of numbers, and by the end of the hour left us with a data set and the necessary code to explore it independently. One of my favorite components of the workshop, actually, was the interactive nature of each presentation with the integration of R or Matlab code.

Guest lecturers introduced many fascinating facets of computational neuroscience, and I especially enjoyed how my knowledge on the subject compounded with each additional lecture. As the workshop progressed I found that I was relating information from one speaker’s presentation back to material I had learned only hours earlier, and I walked away with a solid foundation in the topic. It very much feels as if I went from zero to one hundred with this material, and I appreciate the challenges posed to us by the complicated subject matter.

Beyond the presentations, the field trip to the laboratory for psychiatric neuroengineering at Duke University provided a “behind-the-scenes” glimpse at the processes of data collection that create the massive sets we dealt with during lecture. I was also just happy for any excuse to ogle the beautiful campus once more. Each new speaker and opportunity brought about new questions to ask and facts to learn, so I was happy for the constantly changing environment of the workshop from lecture to lecture, or even breaks for the field trip or panel.

students by SAMSI sign

From left to right: Jordan Zeldin, Eion Blanchard, Sarah Lotspeich, Michelle Zamperlini.

The many bus rides provided unexpectedly pleasant opportunities to meet new people, as well, as I was shuffled into new groups with each trip. I thoroughly enjoyed swapping stories about my university – about the weather, everyday dress code, the statistics department – with people from other schools! And I was even lucky enough to give suggestions about things to do and places to eat in Florida, as one of my new friends is planning a trip to the Sunshine State soon. Perhaps the most unexpected bonus to this experience was the people.

This was honestly one of the most incredible groups of students, and upon learning more about each person and their involvement I am absolutely honored to have been selected among them for the 2015 SAMSI Undergraduate Workshop. Though the workshop lasted only two days, the people I met and the research I was immersed in will carry through my entire career. I cannot emphasize enough the importance of this experience and how strongly I recommend it.

There is a 100% probability that I would love to return to SAMSI sometime in the future.

Understanding Droughts – Part of the Undergraduate Modeling Workshop May 17-22, 2015

The following was written by Gabriel Ruiz, attendee from the University of California, Riverside.

attendees sitting listening to lecture

All of the attendees and some of the speakers on Day 1 of the workshop

The workshop attendees hard at work.

Just a few weeks ago in May, I was fortunate to be among the 26 undergraduates to attend one of many undergraduate workshops offered at the Statistical and Applied Mathematical Sciences Institute (SAMSI). This was a 5-day-long workshop on mathematical and statistical modeling. The backgrounds of students in attendance ranged from mathematics and statistics to chemical or aerospace engineering and other fields, and they came from universities all across the country. There were also current researchers from SAMSI and other universities in attendance who gave talks on very interesting topics and who led the workshop sessions. Among my favorite parts of this workshop were the talks on Bayesian Statistics and Discriminant Analysis, meeting some established researchers, getting to know my peers in mathematics and statistics, the great food we had, and, of course, having the opportunity to visit SAMSI in such a beautiful section of the country.

First Impressions: Raleigh, SAMSI, and NC State

students walking past sign

Attendees as they arrive at SAMSI to kick start the workshop.

cement pathway with trees

The scenic path attendees took to explore NC State and the surrounding area on the first day.

My very first impression of Raleigh and its surrounding area was how green and pretty everything was. Coming from California, and considering the current drought we are experiencing, this was quite a sight. It had such a relaxing feel.

Students in front of the James B. Hunt Library

Workshop attendees visiting the famous Hunt Library at NC State.

Later on, it was fun meeting with all of the other undergraduate attendees at North Carolina State University, where we all stayed for the next 5 days. In the evening, after some great food, we took a walk around campus and even visited the renowned James B. Hunt Jr. Library. The NC State campus is so beautiful and big! Because of this, we got a little lost but that ended up being a good thing because we were able to see some more of the surrounding area in Raleigh.

The next day, we went to SAMSI on the other side of town for the introduction to what we would be doing throughout the week. We heard from some speakers on interesting topics, and ate some more delicious food. It was nice to get a sense of all the great work that goes on there.

Building on the NC State campus

A scenic example of Raleigh and NC State beauty.

The rest of the workshop was held in SAS Hall at NC State, named after the statistical software company when it was donated by former statistics faculty and founders of SAS Institute Inc. This building is home to the Mathematics and Statistics departments and was just a short walk from where we were staying. The place we stayed at, I should add, had a volleyball court that hosted several competitive games of volleyball among the attendees. This was a fun break after a day of math and statistics.

3 postdocs

Kimberly Kaufeld, Daniel Taylor-Rodriguez and Jyotishka Datta, all postdocs at SAMSI, working together.

There were plenty of informative talks given by researchers from various universities. Some of the notable talks were given by:

Paul Brooks from Virginia Commonwealth University on “What Causes Shifts in the Human Microbiome.” This talk focused on the Community State Types (CST) of the vaginal microbiome to identify the microbiome profiles that are associated with a high risk of certain diseases as well as devising better predictions for changes in CSTs over time. Students at the workshop were able to work on a subset of this interesting project throughout the rest of the week.

Daniel Taylor-Rodriguez, a SAMSI postdoc, spoke about his approach to parameter estimation and variable selection for site-occupancy models that use presence-absence data. He presented an occupancy model with probit links and demonstrated his work on deriving more objective parameter priors as opposed to using AIC methods or other Bayesian approaches that require substantially more prior knowledge than is usually available.

Lea Jenkins of Clemson University gave a great talk titled “The Strawberries of Wrath: Farming Under the Realities of Drought”, in which she spoke about the current drought crisis in California, where 80% of the fruits and vegetables consumed in the US come from. The main focus of her talk was describing her and other mathematicians’ role in creating the “virtual farmer” software tool and the team’s use of mathematical modeling and optimization to help farmers in Pajaro Valley, CA remain profitable through current water restrictions. This challenging project was the primary motivation for the second project students were able to work on during this workshop.

Two other SAMSI postdoctoral researchers, Kimberly Kaufeld and Yize Zhao, also led hands-on workshops in R, a statistical software environment, which were very informative to those of us who had limited experience with R. Jyotishka Datta, another postdoc at SAMSI, had a session in which he went over introductory statistical and probabilistic concepts in regression and classification in addition to high-dimensional applications and their implementations in R. A fifth postdoc, Christopher Strickland, went over some very useful approaches to the modeling and data analysis of dynamical systems in Python, as an alternative or complement to R and Matlab.

Among other notable talks were those by NC State PhD student Neal Grantham and SAS Institute Data Scientist Yue Qi. Neal Grantham’s talk focused on an alternative approach to identifying the origin and history of a dust sample through the pollen found in it; the approach uses discriminant analysis and DNA sequencing to identify samples to within a short distance with a measurable degree of certainty, as a complement to a pollen expert’s more subjective identification. Yue Qi’s talk was about the tools he is helping to develop at SAS to more easily analyze “Big Data”, and more specifically he focused on the use of these tools in Machine Learning approaches to fight banking and insurance fraud.

These talks were all of the high quality you would expect at SAMSI, yet were accessible for all of us as undergraduates. After listening to all of these, I hope to learn some more about the research techniques that were discussed and maybe even contribute to the areas in which they have applied these techniques, such as the California drought. It was nice to get a feel for just how broad statistics and mathematics are.

The Workshop: working with a predator-prey dynamical system dataset

For the actual workshop aspect, we were split into groups of 5 that each worked on one of two very interesting topics. The first topic dealt with modeling a predator-prey dynamical system that was meant to be a simplified representation of the more complicated drought situation currently affecting California farms, which account for a large portion of the US vegetable and fruit supply. The second topic had to do with performing discriminant analysis to differentiate between microbiome states that are defined by the various levels of vaginal microorganisms thought to be higher or lower risk factors for certain diseases as compared to other microbiome states.

Group with mentors

One of the workshop groups alongside their mentors for the week, Daniel Taylor-Rodriguez (first on the left) and Kimberly Kaufeld (furthest on the right).

The dataset I worked with was the predator-prey dataset. We were tasked with first analyzing the time series data we were given on the abundance of three variables: water, plants, and beetles. The key here was to use some sort of time series techniques to model each variable against time. After we were able to find good models for each variable, we could plot the fitted lines of all three to see how they varied over time. The first observation we had was that the densities of each varied over time according to a sine and cosine pattern, so naturally we used a time series model with these properties. The fitted lines further demonstrated that plants had a spike (or dip) in their density whenever there was a spike (or dip) in the water supply. Of course, we know plants depend on water but it was nice to see this graphically over time. There was a very high correlation between these two variables, which helped quantify how strong the relationship was. This relationship is the key characteristic of a dynamical system. Because we had the “noisier” dataset, the same dependency of beetles on plants was not as observable, although it was present.
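The sine-and-cosine fit and the correlation check described above can be sketched roughly as follows. This is only an illustration, not the R code we used at the workshop: the series are made up, the period is assumed to be known, and the function names `fit_harmonic` and `pearson` are hypothetical helpers written for this example.

```python
import math


def fit_harmonic(y, period):
    """Least-squares fit of y[t] ~ m + a*cos(w*t) + b*sin(w*t), w = 2*pi/period.

    Assumes evenly spaced samples t = 0, 1, ..., n-1 covering a whole number
    of periods (and period > 2). Under that assumption the regressors are
    orthogonal, so the least-squares coefficients reduce to simple averages.
    """
    n = len(y)
    w = 2 * math.pi / period
    m = sum(y) / n
    a = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    b = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    return m, a, b


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sx = math.sqrt(sum((u - mx) ** 2 for u in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return cov / (sx * sy)


# Made-up seasonal series (not the workshop data): 48 values, period 12.
PERIOD, N = 12, 48
w = 2 * math.pi / PERIOD
water = [10 + 3 * math.cos(w * t) for t in range(N)]
plants = [5 + 2 * math.cos(w * t) + 0.5 * math.sin(w * t) for t in range(N)]

water_fit = fit_harmonic(water, PERIOD)    # (mean, cosine coeff, sine coeff)
plants_fit = fit_harmonic(plants, PERIOD)
water_plants_corr = pearson(water, plants)
```

With real, noisy data the samples rarely cover exact whole periods, so a general least-squares routine (for example `lm` in R) would be the safer choice; the shortcut above only holds under the stated assumption.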

The next part of the workshop was to develop a system of differential equations that brought together all of these relationships. We used the Lotka-Volterra equations, which are also known as the predator-prey equations. The key here was that the parameters and variables needed some tweaking through ODE packages in R, further simulation, and our own intuition in order to best describe the system. This was interesting considering we had three variables to work with: a natural resource, a prey, and a predator. The transition from the statistical aspect of this to mathematical modeling was the trickiest part, to say the least, since our group had no real experience with differential equations, much less bridging math and statistics in this way. Luckily, the postdocs running this workshop, Drs. Kaufeld and Taylor-Rodriguez, walked us through the process and taught us about these equations.
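For readers who have not seen a predator-prey system with three variables, here is a minimal sketch of one possible form. The equations, the parameter values, and the hand-written Runge-Kutta integrator are all assumptions chosen for illustration; they are not the model or the R ODE packages from the workshop. The resource R grows logistically and is consumed by the prey P, which is in turn consumed by the predator B.

```python
def food_chain_rhs(state, p):
    """Right-hand side of an assumed three-level Lotka-Volterra food chain."""
    R, P, B = state
    dR = R * (p["r"] * (1 - R / p["K"]) - p["a"] * P)        # resource
    dP = P * (p["e1"] * p["a"] * R - p["d1"] - p["b"] * B)   # prey
    dB = B * (p["e2"] * p["b"] * P - p["d2"])                # predator
    return (dR, dP, dB)


def rk4_step(f, state, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(x + h * k for x, k in zip(base, slope))

    k1 = f(state, p)
    k2 = f(shifted(state, k1, dt / 2), p)
    k3 = f(shifted(state, k2, dt / 2), p)
    k4 = f(shifted(state, k3, dt), p)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, p, dt=0.01, steps=2000):
    """Integrate the system and return the list of visited states."""
    states = [tuple(state)]
    for _ in range(steps):
        states.append(rk4_step(food_chain_rhs, states[-1], dt, p))
    return states


# Hypothetical parameter values, picked so the equilibrium is easy to verify.
PARAMS = {"r": 1.0, "K": 10.0, "a": 0.5, "e1": 0.5,
          "d1": 0.5, "b": 0.5, "e2": 0.5, "d2": 0.25}

# Coexistence equilibrium for these parameters (all three derivatives vanish).
FIXED_POINT = (5.0, 1.0, 1.5)

trajectory = simulate((4.0, 1.2, 1.0), PARAMS)
```

Tweaking the entries of `PARAMS` and re-running `simulate` mimics, in a small way, the trial-and-error tuning described above; fitting the parameters to data would need an optimizer on top of this.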

two workshop members giving a talk

Two workshop attendees presenting their findings on the predator-prey dynamical system.

While I am still not completely comfortable with this last aspect, it was important to see the union of statistics and math modeling as a person who is mostly accustomed to the data analysis side. I have already started to look into creating a better system of differential equations this summer. And because I gained curiosity in this type of modeling after the workshop, I am also signed up for some extra math classes on ordinary and partial differential equations for next year and might even take some coursework in dynamical systems somewhere down the line.

Final thoughts: My key takeaways

Coming from California, it was interesting to see just how complicated these dynamical systems involving the seasonality of rain can be. It is important to note that our dynamical system, although still difficult to model with three variables, was much simpler than the current drought in California. I can only imagine how many variables the analysts involved with this have to deal with, including legislation, people refusing to let their lawns go dry, and the system of aqueducts running under farmland, which makes modeling water levels quite challenging. Although the problem is difficult, there are plenty of mathematicians involved in the effort to conserve water in the most efficient way possible, including Clemson University’s Dr. Lea Jenkins, who gave a great talk on the topic. Having lived in a section of California affected by this drought and attended this workshop, I am curious enough to stay in the loop about what mathematicians will continue to do.

As I finish up my second year studying statistics at the University of California, Riverside, this opportunity was an invaluable and eye-opening experience. While I have not been in the world of Mathematics and Statistics for a long time, this workshop sparked curiosity in me about topics I had not yet been acquainted with but would now like to learn more about. For example, this summer, I will almost surely look into developing a better set of differential equations for the predator-prey dataset we were given during the workshop. I would also like to look into the other dataset to learn more about discriminant analysis. I have also come to realize that computational skills are very important. On my programming to-do list this summer are Julia, Python, and some more R.

Besides the new statistical and mathematical techniques that we learned, I feel the main theme that I have taken away from this workshop is that statistics, math, and computing can all be brought together for meaningful applications in ecology and human health. Moreover, it is refreshing to have experienced first-hand that statistics and math are more than just numbers and equations in a textbook like I had become accustomed to in some of my coursework so far.

It was great to be around a great undergraduate cohort of statisticians and mathematicians who are all at the same point in their careers in this type of environment doing what we love most. The perspective I gained from my peers here, who are all from different universities across the country, about classes to take and interesting research topics is invaluable. To have met some established applied statisticians and mathematicians and listened to their research talks was inspiring. I hope to one day achieve that same level of expertise and fun they are having.

If you are an undergraduate student considering applying to one of these workshops at SAMSI, I highly recommend that you apply and attend! You won’t regret it!

portrait of Gabriel Ruiz

Gabriel Ruiz.

Recovering from the Epigenetics Workshop

three people talking at the meeting

Michael Zhang (UT Dallas), Zhaohui (Steve) Qin and Shili Lin (Ohio State, co-organizer)

The following is from Zhaohui (Steve) Qin, Associate Professor, Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, who attended the SAMSI Epigenetics Workshop March 9-11, 2015.

I was sitting near my departure gate at RDU Wednesday afternoon, waiting for my return flight after attending the SAMSI workshop on Epigenetics. Suddenly I feel so tired. I have good reason for being exhausted. I feel that my brain has been set on high-spin mode all of the last three days. This is so strange. It is supposed to be a low-intensity meeting. Only a handful of talks and just about 50 attendees. It feels so different from attending other conferences such as ENAR, JSM or ASHG.

First, I know almost everyone at the workshop. For the speakers, I either know them personally, or I know their work. At every break, I barely have time to grab a cup of coffee, not to mention check emails. There is always someone I want to talk to within five feet of me wherever I go. And unlike at the massive conferences, there is plenty of space in the corridor of this cozy SAMSI building. So I feel totally comfortable joining a conversation.

Inkyung Jung (UCSD), Chenchen Zou (Jackson Lab) and Miriam Huntley (Harvard).

Second, there is so much to learn, to talk about and to think about. Epigenetics is a hot area these days; new technologies and new findings are emerging almost daily. This is a great opportunity to immerse myself in this exciting field, with so many experts in these areas walking around me. Thanks to Dr. Shili Lin, the set of speakers at the workshop is amazing: a few senior and very experienced scientists plus a large cohort of young and energetic scientists. In the past three days, I learned several new ideas and findings. And I am pretty sure my fellow attendees felt the same way. Everyone is asking each other what’s new. I won’t be surprised if new collaborations were started right at the workshop. I wish more of the conferences I attend were like this one. And I am certain that I will come back to this nice little building when the next opportunity arrives.

Group of people sitting looking at laptops during a break

Yongseok Park (U Pittsburgh), Inkyung Jung (UCSD)

I felt so sympathetic towards my colleagues Karen Conneely and Hao Wu, who had to drive six hours back home. How someone can still have the energy to do that after three long days is really beyond me. I am determined that I am going to sleep soundly during my flight back, no matter how bad the turbulence is.

And I did.

Graduate Students Work on Real-World Problems at 20th IMSM Workshop

The 20th Industrial Mathematical and Statistical Modeling Workshop for Graduate Students (IMSM) wrapped up last week. The students met for 10 days and broke into five teams, working with mentors from government and industry on real-world problems.

group shot on stairs

20th IMSM workshop participants and their mentors.

Thirty-one graduate students from 28 different institutions participated in this year’s workshop. On the first day, the representatives from industry and government presented their projects, which ranged from developing a water purification system to finding where a meteor might have crashed in Russia in 2013.

Lincoln Lab group shot

Pictured (L-R) are: Michael Minner, Drexel; Jingnan Fan, Rutgers; Benjamin Levy, U. Tennessee; Het Mankad, U. of Texas at Dallas; Alex Farrell, Arizona State; Hossein Aghakhani, SUNY at Buffalo; Ya-Ting Huang, U. New York-Stony Brook; John Peach, MIT Lincoln Laboratory and Minh Pham, SAMSI.

The “Hunt for Red Hot Rock-tober” group, mentored by John Peach, MIT Lincoln Laboratory, and Minh Pham, SAMSI, included: Hossein Aghakhani, SUNY at Buffalo; Jingnan Fan, Rutgers; Alex Farrell, Arizona State; Ya-Ting Huang, Stony Brook; Benjamin Levy, U. Tennessee; Het Mankad, U. of Texas at Dallas; and Michael Minner, Drexel.

They tried to figure out exactly where a meteor landed after it exploded in an airburst on February 15, 2013 somewhere south of the city of Chelyabinsk, Russia. The group used Bayesian search methods to formulate as many hypotheses as they could about what happened to each object, assuming that it most likely broke up into several smaller chunks as it entered the atmosphere. For each hypothesis, they constructed a probability density function for the location of each object. The other scenario is that it stayed in one piece and hasn’t been found yet. The group used Google Earth images and created a Google Earth sensor to detect meteor-like shapes. They made a probability map of where chunks of the meteor may have landed, sorted from the highest probability down. They only searched the top 90% and then looked at images before and after the event. They needed to reduce the false alarms, so they converted the images to gray scale and then to binary. They re-grayed the images and used a Gaussian blur to detect differences in the before and after images that were round-shaped like a crater would be. This reduced the false alarms from 71 to 27. Seven of these images seemed acceptable, but ultimately none of the images they looked at showed craters. They concluded that there was a 57.2% chance that there was no crater in the area.
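The general idea behind a Bayesian search of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the group's method or code: the grid, the prior probabilities, the detection probability, and the function names `cells_covering` and `update_after_miss` are all hypothetical, and the numbers are not meant to reproduce the 57.2% figure.

```python
def cells_covering(prior, mass=0.9):
    """Indices of the most probable cells that together hold `mass` of the
    total probability assigned to the map, most probable first."""
    order = sorted(range(len(prior)), key=lambda i: prior[i], reverse=True)
    target = mass * sum(prior)
    chosen, total = [], 0.0
    for i in order:
        if total >= target:
            break
        chosen.append(i)
        total += prior[i]
    return chosen


def update_after_miss(prior, p_none, searched, p_detect):
    """Bayes update after searching some cells and finding nothing.

    prior[i] is P(crater in cell i) and p_none is P(no crater in the area),
    with sum(prior) + p_none == 1. p_detect is the assumed chance of spotting
    a crater in a searched cell when one is really there.
    """
    searched = set(searched)
    weights = [
        p * ((1 - p_detect) if i in searched else 1.0)
        for i, p in enumerate(prior)
    ]
    evidence = sum(weights) + p_none  # P(nothing found)
    return [wgt / evidence for wgt in weights], p_none / evidence


# Hypothetical five-cell map plus a "no crater at all" hypothesis.
prior = [0.40, 0.25, 0.10, 0.03, 0.02]
P_NONE = 0.20

searched_cells = cells_covering(prior, mass=0.9)
posterior, posterior_none = update_after_miss(
    prior, P_NONE, searched_cells, p_detect=0.8
)
```

Each unsuccessful look shifts probability away from the searched cells and toward the unsearched cells and the "no crater" hypothesis, which is how an empty search can still end in a quantitative conclusion.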

Army Corps of Engineers group shot

Pictured here from (L-R) are: Benjamin Ritz, Clarkson; Monica Nadal-Quiros, U. of Puerto Rico; Caleb Class, MIT; Tyson Loudon, Colorado School of Mines; Star-Lena Quintana, Temple; Lea Jenkins, Clemson; Matthew Farthing, U.S. Army Corps of Engineers, Fei Cao, Pennsylvania State; and Xiangming Zeng, North Carolina State U.

The group working with Matthew Farthing, U.S. Army Corps of Engineers and Lea Jenkins, Clemson University, on the project entitled “Water purification via Membrane Separation,” included: Fei Cao, Pennsylvania State; Caleb Class, MIT; Tyson Loudon, Colorado School of Mines; Monica Nadal-Quiros, U. of Puerto Rico; Star-Lena Quintana, Temple; Benjamin Ritz, Clarkson; and Xiangming Zeng, North Carolina State U.

They were looking at a way to create the best water purification system. While filtration is typically used to remove a particular contaminant, it can also be used to retrieve valuable components. This would be useful for other industries, such as the pharmaceutical industry or polymer processing. The group used simulation-based optimization to look at how to improve membrane performance for filtration and separation processes. One of the important applications for this project was to purify water for army personnel in the field, who need to reduce pathogens, quickly purify water and reduce the incidence of membrane clogging. Due to time constraints, the group focused on one-dimensional models, but suggested that future work would use 2-D or 3-D models to better represent the dynamics of the separation process.

CDC group shot

L-R: Isabel Chen, Emory; Christina Edholm, U. Nebraska-Lincoln; Howard Chang, Emory; Simone Gray, CDC; Rachel Grotheer, Clemson; Tyler Massaro, U. Tennessee; Yiqiang Zhen, Purdue.

The “Geographic and Racial Differences of Persons Living with HIV in the Southern United States” group was mentored by Simone Gray, Centers for Disease Control and Prevention (CDC), and Howard Chang, Emory. The group included: Isabel Chen, Emory; Christina Edholm, U. Nebraska-Lincoln; Rachel Grotheer, Clemson; Tyler Massaro, U. Tennessee; and Yiqiang Zhen, Purdue.

The group was tasked with quantifying the contribution of race and socioeconomic determinants to the overall presence of HIV, particularly focusing on the Southeast. They used the 2010 U.S. Census data and the American Community Survey, along with the CDC’s National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention (NCHHSTP) Atlas, and looked at several variables including unemployment, education level, race, urban status, poverty and income at the county level, which included 1,422 counties in 16 states. To analyze the data, they used three types of regression modeling (multiple linear, conditional autoregressive, and Bayesian Poisson hierarchical models), non-metric multidimensional scaling, and two types of cluster analysis (K-Means and Besag-Newell). They concluded that non-Hispanic black ethnicity remained the most important indicator of HIV prevalence rate in the southern United States.

Rho Inc. group

L-R: Yuanzhi Li, Utah State; Hongjuan Zhou, U. of Kansas; Anastasia Wilson, Clemson; Tamra Heberling, Montana State; Nancy Hernandez Ceron, Purdue; Agustin Calatroni, Rho Inc.; Alexej Gossmann, Tulane; and Herman Mitchell, Rho Inc.

Another group worked on the “Allergy, Asthma and Exposures in the Homes of the US Population” problem. The group, mentored by Agustin Calatroni, Herman Mitchell and Russ Helms of Rho, Inc. and Sanvesh Srivastava of SAMSI, included: Alexej Gossmann, Tulane; Tamra Heberling, Montana State; Nancy Hernandez Ceron, Purdue; Yuanzhi Li, Utah State; Anastasia Wilson, Clemson; and Hongjuan Zhou, U. of Kansas.

From 1980 to 2012, cases of asthma in the U.S. increased by 171%. Allergies and asthma cost about $56 billion a year. The group worked with data from the 2005-06 National Health and Nutrition Examination Survey (NHANES) to develop a prediction model for asthma based on allergies and exposures in the home. The survey covered about 10,000 people to determine the prevalence of major diseases and the risk factors for those diseases. Rooms in the participants’ homes were vacuumed to collect dust samples. The students used logistic regression, LASSO regression and random forest models to examine the data. They concluded that the random forest models had the highest prediction accuracy.

SAS group

L-R- Kenny Lopiano, SAMSI and Duke; Obeng Addai, Youngstown State; Shrabanti Chowdhury, U. California at Riverside; Piaomu Liu, South Carolina; Mark Wolf, SAS; Karianne Bergen, Stanford; Xin Huang, U. Texas at Dallas; and Fatena El-Masri, George Mason.

Another group worked on the “Analysis of Self-Reported Health Outcomes Data” project. The group, mentored by Mark Wolf, SAS, and Kenneth Lopiano, SAMSI and Duke, included: Fatena El-Masri, George Mason; Karianne Bergen, Stanford; Obeng Addai, Youngstown State; Piaomu Liu, South Carolina; Shrabanti Chowdhury, U. California at Riverside; and Xin Huang, U. Texas at Dallas.

This group looked at self-reported health outcomes data from web-based media sources. Clinical outcomes are usually derived from patient surveys and from formal reports that physicians file when, for example, a drug causes a side effect. However, many people discuss drugs and health on forums, bulletin boards and social media outlets, which gives more immediate feedback about how a drug may be performing. Text mining techniques are essential for extracting this kind of feedback. The group used SAS Enterprise Miner to parse, filter and identify topics in each document they examined. They proposed a set of methods that take advantage of SAS Text Miner to tag each word by part of speech (noun, verb, adjective, etc.), apply a filter to decide whether to keep or drop the word, and then have the program classify the word into a category. They looked at author interactions and applied a page rank algorithm. They then conducted a sentiment analysis to gauge the emotion around the posts, discarding the uninformative posts and keeping the ones that seemed noteworthy. Finally, they used a burst detection method to see whether chatter on a topic was increasing, and a Markov model to analyze the interarrival gaps.
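To give a feel for the page rank step applied to author interactions, here is a minimal Python sketch of PageRank by power iteration. It is only an illustration under stated assumptions: the group worked in SAS and this post does not show their implementation, the reply graph and author names below are hypothetical, and the damping factor of 0.85 is the conventional default rather than a value taken from the project.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {author: [authors they replied to]}."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every author receives the same small baseline share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this author's rank evenly among the authors replied to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Author with no outgoing replies: spread rank over everyone.
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
        rank = new_rank
    return rank


# Hypothetical reply graph: an edge a -> b means author a replied to author b.
replies = {
    "author_a": ["author_c"],
    "author_b": ["author_c"],
    "author_c": ["author_a"],
    "author_d": ["author_c", "author_a"],
}

ranks = pagerank(replies)
for author, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{author}: {score:.3f}")
```

In this toy graph the author who attracts the most replies ends up with the highest score, which is the sense in which a page rank algorithm can flag influential posters on a forum.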

To get a much better understanding of the work that was conducted during this workshop, read the final report here.

 

Apply Now for SAMSI Undergraduate Modeling Workshop

Undergraduate students take note! SAMSI is taking applications for a unique, week-long opportunity to explore mathematical and statistical research in data modeled using networks. Talks will be presented by statisticians and mathematicians who work with networks, particularly focusing on social networks.

Many communication media, from face-to-face conversations and text messaging to Facebook and Twitter, make modern social networks complex and exciting systems to study. Students will look at questions such as how an individual’s attitudinal, behavioral or health characteristics are altered as a result of interacting with others.

For a good part of the week, students will work in teams, using data from the Social Evolution experiment in the MIT Human Dynamics Lab to investigate a variety of questions related to the formation and evolution of social networks. The data cover approximately 100 students in a college dormitory during the 2008-2009 academic year.

Students will spend most of the week on the campus of North Carolina State University in Raleigh, North Carolina.

Hurry, as the deadline to apply is April 7, 2014 at 5pm. More details and the application can be found here.

Predicting number of landfalls of hurricanes — Undergraduate Modeling Workshop produces forecasts for 2013

group shot of undergraduates attending May 2013 workshop

Undergraduate workshop from May 2013.

Thirty-four undergraduate students from around the U.S. came to SAMSI and NC State University the week of May 13-17. During the week, the students interacted with an atmospheric scientist who works on hurricane research, and with applied mathematicians and statisticians who work on climate research. Students used the same database that is used at NCSU to forecast various aspects of future hurricane seasons, and built Poisson regression models in R to produce their own forecasts of the 2013 hurricane season in the U.S. Below are some comments from participants:

three students with signs

Corey Raphael, U. Florida, Jonathan Skantz, U. Florida and Gwen Tian, U. British Columbia.

Corey Raphael, University of Florida
“I had a great time during my week at SAMSI! I learned all about climate science and hurricane predictions, and met a lot of great people. Thanks for all the advice and free food! I enjoyed getting to know the Raleigh area, and I learned a lot about R that I didn’t know previously. I hope the program enjoyed having me as much as I enjoyed being here!”

Group 3 shot

Evan Bittner, Penn State, Kasey Palmquist, UNC Wilmington, and Daria Drozdova, Pomona College.

Kasey Palmquist, University of North Carolina at Wilmington
“The workshop was an excellent experience; I truly feel that I am not leaving empty-handed. I not only learned new methods of statistical analysis, but how to collaborate with a group of people on a research topic. I found this workshop beneficial because it allows undergraduates to get a “feel” of mathematical/statistical research in order to see if it is right for them. I found the workshop to also be a great way to network and meet people that share the same interests as you. Overall, great experience!”

Group 6 SAMSI undergraduate modeling workshop May 2013

Brandon Sherman, U. Pitt, Kehao Zhu, Purdue, and Vinicius Taguchi, NCSU

Vinicius Taguchi, North Carolina State University
“This workshop was a wonderful experience.  I gained a better appreciation for statistics and applied mathematics, made lasting friendships, and got to see a new side of NC State University.  When I first got here, I was a little concerned about being one of the few non-math/stats majors, as well as one of the very few underclassmen.  Nevertheless, this never became an issue and I felt like part of the group right from the get-go.  Thank you, SAMSI.”

Group 2 photo SAMSI undergraduate modeling workshop May 2013

Lee Richardson, U. Washington-Seattle, Charles Ho, Rice and Anna Peris, Marquette.

Lee Richardson, University of Washington at Seattle
From his Twitter feed – “Predicted a Poisson Distribution with a mean 3.96. AKA 56% chance of greater than 4 hurricanes!!!!!”
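For readers curious how a probability like this follows from a Poisson forecast, here is a small Python sketch (the students themselves worked in R, and this is not their code). It takes only the mean of 3.96 from the tweet and computes two tail probabilities. The function and variable names are ours, and reading the 56% figure as "four or more" rather than strictly "more than four" is our interpretation of the numbers.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_tail(k, mean):
    """P(X >= k), computed as one minus the probability of fewer than k events."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


mean_landfalls = 3.96  # forecast mean quoted in the tweet

p_at_least_4 = poisson_tail(4, mean_landfalls)   # four or more
p_more_than_4 = poisson_tail(5, mean_landfalls)  # strictly more than four

print(f"P(X >= 4) = {p_at_least_4:.3f}")
print(f"P(X > 4)  = {p_more_than_4:.3f}")
```

With a mean of 3.96, the chance of four or more events comes out at roughly 56%, matching the tweet, while the chance of strictly more than four is about 36%.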

Here are some of the presentations that the students gave the last day of the workshop.

Impressions from the Undergraduate Workshop on Data-Driven Decisions in Healthcare

big group of students outside SAMSI

February 2013 Undergraduate Workshop participants.

SAMSI recently held the Undergraduate Workshop on Data-Driven Decisions in Healthcare for about 30 students. Visiting professors, postdoctoral fellows and graduate fellows who are participating in this SAMSI program led the sessions, bringing cutting-edge research into the lectures. Students had a chance to work with data from the SEElab at Technion in Israel, and received an overview of personalized medicine, a tutorial in R and a demonstration of the ARENA software. Here are a few of the students’ impressions from the workshop.

Eric Laber instructing students

Eric Laber, NCSU, giving a lecture at the workshop.

Eric Kernfeld, Tufts University Class of 2014, Applied Mathematics

“I had a great time at the workshop on Data Driven Decisions in Health Care this past weekend. It was a nice opportunity to meet statisticians, something I don’t get the chance to do back at Tufts. I also met a lot of undergraduates majoring in statistics and mathematics. The food was good, the staff were welcoming, the accommodations were convenient, and the talks were well-pitched. I recommend SAMSI workshops to anyone who’s interested in the topics, especially to people considering graduate education down the road.”

Danielle Llanos, Georgetown University

“I thought the SAMSI workshop was wonderful. It was a great opportunity to learn from talented individuals, and a chance to expand my network. The lecture topics were incredibly interesting and were very relevant to my career goals. Probably the best part of the workshop was the graduate student panel. The ability to ask those burning questions and learn from the experiences of others was great. I would recommend any SAMSI workshop to students looking to learn more about opportunities in the sciences, and expanding their educational experiences.”

three students at table

Students networking at lunch.

Brittany Boribong, sophomore, biomathematics major at University of Scranton

“As a student with no background in statistics and programming, I found the workshop a bit overwhelming but no less interesting. Coming into this with no experience just allowed me to take that much more out of the workshop.  I was able to explore new fields of math that I never considered before and learn about topics that I had no idea even existed. As a Biomathematics major, I found the topic of using data to derive decisions in healthcare intriguing since it is an application of my major that I was not aware of. Another wonderful aspect of the workshop was the chance to speak to people in different fields. During lunch, I had the opportunity to speak to a post-doc fellow and during dinner, I spoke to one of the professors that gave a lecture earlier in the day; these opportunities don’t come along every day. It was enjoyable hearing their stories and being able to have a casual conversation with them. The panel made up of current graduate students and post-docs was also helpful in that they were able to share their experiences about graduate school and offer along any advice. I found it particularly helpful since one of the speakers was currently in a biomathematics program and I was able to ask questions I had about my major.

However, the best part of the workshop, in my opinion, was being able to meet other students. Coming from a university with a smaller math department, I really enjoyed meeting students from around the country with interests similar to my own. It was great being able to make connections with students in different fields and from universities from all over. Overall, I had a wonderful time meeting new people and exploring different fields of mathematics during the workshop and found this to be a great experience.”

Apply Now for the 2-Day Undergraduate Workshop at SAMSI October 26-27

group of undergraduate students from 2011

Last year’s undergraduate workshop group.

SAMSI is accepting applications for the two-day undergraduate workshop that will focus on Statistical and Computational Methodology for Massive Datasets. The workshop will be held October 26-27 at SAMSI in Research Triangle Park, NC. The program begins at 9:30am on Friday, October 26 and ends at noon on Saturday, October 27.

Applications received by Friday, September 28 will receive full consideration. SAMSI will reimburse appropriate travel expenses, including food and lodging. Participants are urged to arrive on Thursday evening.

The Statistical and Computational Methodology for Massive Datasets program focuses on fundamental methodological questions of statistics, mathematics and computer science posed by massive datasets, with applications to astronomy, high energy physics, and the environment. Serious challenges posed by massive datasets have to do with “scalability” and “data streaming.”