(*Note: This is the fourth and final part in a series about graduate life in statistics, co-written by Mike and Greg. For links to all articles in the series, click here.*)

There are a lot of things you don’t learn in graduate school that you probably should.

Here’s our perspective, using our programs (Mike @ Brown + UMass, Greg @ UConn and WPI) and the experiences of our peers in other programs as baselines.

**Data science**

Not many statistics programs cover this kind of thing in their curriculum, but it’s a pretty important skill. In our view, in modern statistics, if you can’t collect or organize data you’re essentially useless. In the “real world”, rarely, if ever, is someone going to give you a rectangular data file with no missing data and say “Analyze this.”

As a result, two elements of data science that are particularly important for statisticians are web scraping and data storage. Web scraping can be used to quickly access terabytes of useful (or useless) data from the internet, while data management programs like SQL servers allow for easier access and storage of large data sets. To answer questions for your dissertation, to land and excel in a summer internship (especially one analyzing data from the internet), and to prepare for a job or research after academics, statisticians will need to know not only how to analyze data, but also how to collect it and store it properly.
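For a rough sense of what that looks like in practice, here is a minimal sketch in Python that uses only the standard library. Everything specific in it is an assumption made for illustration: the HTML snippet is made up and stands in for a page you would actually download, and the `scoring` table, its column names, and the helper functions are hypothetical. It pulls the rows out of an HTML table, stores them in an in-memory SQLite database, and keeps an empty cell as a missing value rather than silently dropping it.

```python
import sqlite3
from html.parser import HTMLParser

# Made-up page content (an assumption for illustration). A real scraper
# would download the page, for example with urllib.request.urlopen.
SAMPLE_HTML = """
<table>
  <tr><th>player</th><th>games</th><th>points</th></tr>
  <tr><td>Player A</td><td>82</td><td>2302</td></tr>
  <tr><td>Player B</td><td>76</td><td></td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_rows(html):
    """Return (header, body) where body is a list of rows of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return header, body


def to_int_or_none(text):
    """Treat an empty cell as missing (NULL) instead of guessing a value."""
    return int(text) if text else None


def store_rows(conn, body):
    """Create the hypothetical `scoring` table and insert the scraped rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scoring "
        "(player TEXT, games INTEGER, points INTEGER)"
    )
    conn.executemany(
        "INSERT INTO scoring VALUES (?, ?, ?)",
        [(p, to_int_or_none(g), to_int_or_none(pts)) for p, g, pts in body],
    )
    conn.commit()


header, body = scrape_rows(SAMPLE_HTML)
conn = sqlite3.connect(":memory:")  # swap in a file path to keep the data
store_rows(conn, body)

# SQL makes the missing data explicit and easy to count.
missing = conn.execute(
    "SELECT COUNT(*) FROM scoring WHERE points IS NULL"
).fetchone()[0]
print(header, missing)
```

Real pages are messier than this, and many people would reach for a dedicated scraping library instead of a hand-written parser, but the basic shape (fetch, parse, clean, store, query) stays the same.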

Most programs don’t teach these topics, and several students end up learning them on their own.

That should change.

**How to explain your science**

Former NBA player Allen Iverson is (in)famous for this rant on practice: *‘We’re talking about practice. I mean listen, we’re sitting here talking about practice, not a game, not a game, not a game, but we’re talking about practice.’*

Iverson’s point was that because he had practiced so many times in his career, missing a practice in preparation for a game meant nothing. That, or he just wanted to be remembered for a really long time on YouTube.

Graduate students aren’t so lucky. Our games – coming in the form of talks, presentations, etc – come at most once or twice a year. As a result, practice takes on an even larger role. Iverson would value the practice before a game a little bit more if he only had one game every six months. Moreover, whereas experienced academics have honed their presentation skills over time, graduate students have never had that time. They should.

One of our advisers discussed a program in his graduate school that mandated biannual presentations for students in front of the entire statistics department, both faculty and classmates. Nearly every doctoral student in the country would benefit from this type of practice (no matter how terrifying it may be). While students might find it annoying at first, they’d benefit in the long run. And faculty would also do well to learn about the research their students are doing, at least as far as preventing the awkwardness that develops when they ask a complicated question at the student’s dissertation defense, which, in many programs, is the first time they’ve seen that student talk.

All this takes is faculty commitment. The results, however, are aplenty. Students will learn both how to talk statistics and how to teach statistics. They’ll learn that no one can follow their entire notation, and as a result, that complicated notation probably wasn’t needed to begin with. They’ll learn how to speak clearly and concisely, and that 20-minute talks shouldn’t last 23.5 minutes because it’s really annoying both for the crowd and for the person speaking last. They’ll learn that crowding a slide with too much information generates eye rolls, and that the beginning and end of the talk mark its most important points as far as audience interest goes.

Practicing talks also makes great preparation for actual job talks or conference talks; the questions you hear can lead to better presentations, better research, and better answers to questions when they inevitably get asked again.

For master’s students who might not be expected to present at conferences, presentations should still become a larger part of a program’s curriculum. At UMass, for example, statistics chair Michael Lavine runs a seminar series in which students present in front of their peers on a weekly basis. It provides students their first run at both presenting graduate-level statistics and using LaTeX for slides. You can find out more about this program in an article Mike helped write for Chance Magazine.

**In general, people won’t understand what you are doing in grad school**

When I (Greg) was in college I took a class called rings and fields. The textbook was called something like “Modern Algebra”. When I was home, my dad saw the book and said, “Algebra? I took algebra. Aren’t you a math major?” It’s not the same thing. You’re going to experience this with statistics. Here is a sample conversation:

Person: “What do you do?”

You: “I’m in graduate school”

Person: [Interested] “Oh what are you studying?”

You: “Statistics”

Person: [Cringing] “Like batting average?”

You: “Yeah. Sort of.”

[Silence]

Then the person will tell you about how much they hated statistics in college. Simply tell them that that means more jobs for you.

And for you biostatistics students? If you tell strangers that you are studying biostatistics, some will actually respond that they feel bad for you. As a result, you might eventually just start telling people that you are studying statistics, just because it’s easier.

When in doubt, however, both statistics and biostatistics students can tell this joke: “Have you ever heard the joke about the unemployed statistician? (No.) That’s because there isn’t one. (L.O.L.)”

It’s funny because every statistician has a job.

On that note, we hope you enjoyed reading our series. If you have any other tips, please feel free to comment below or send us an email.

Cheers!

Sooo as a current PhD student in Biostats do you have any recommendations for learning the data science skills? I can handle missing data like a pro (since that’s my dissertation topic) but I have zero experience with SQL.

Thanks for reading, Kathy!

I have a soft spot for missing data. Most of what I learned was by trial, error, and googling. Of course, what I learned was mostly how to manipulate the data and eventually upload it into R. I know the Hopkins Data Science course specifically covers SQL, and it’s free, so if I had to do it all over again, I’d try one of those.

Hope that helps!

First off, absolutely love this sequence of posts. Thanks for writing them. Second, I hope that curricula at major biostat/stat places will adapt. We definitely have at Hopkins. This is our methods sequence for PhD students (that I taught) that includes all sorts of data hacky stuff: https://github.com/jtleek/jhsph753and4/tree/master/lectures.