Joy in Research

By Charles Sutton on January 1, 2023

It’s hard to remember what machine learning conferences were like in person, but I think that I liked them a lot better when the field was smaller. It was easier to meet new people, to talk about new ideas in small groups of talented people who were really excited by them. The culture of the field has changed from back then, in intellectual ways that you know already, but the social ways are worth reflecting on, too.

Romanticizing the past is a sorry hobby for old men. I will not do that. All research communities have a callous system at their core — I’ve written about the toxic forces that drive and derail research careers — that’s always been the case.

Maybe now, these forces have gotten stronger and worse: The field is so much larger, we don’t all know each other, and the rewards for success are greater than I would have thought possible twenty years ago. The incentive to be competitive rather than kind, to fight on social media rather than to learn from people you disagree with, to speak hype rather than forthright truth, is so hard to resist. The temptation, in a word, to career rather than to learn. And as loath as I am to admit it, I have changed, too. Probably in similar ways, and for similar reasons.

Of course I wouldn’t want to go back. The most exciting time to be in this field is right now. But maybe I would go back in time, if just to grab something and come back. What I would bring with me is joy. Joy in reading a paper with a cute new idea, and chatting about it with friends over lunch. The joy that would impel us to make up a stupid song or a silly skit about our research workshop, with the same spirit of play to work out a new kind of algorithm, which may seem silly at first.

Joy in learning, joy in sharing, and joy in trying things out.

Happy new year!

Tags: advice, new years

ICLR 2020: My first virtual conference

By Charles Sutton on May 10, 2020

(or: How I learned to stop worrying and love virtual conferencing.)

Many computer science conferences are becoming virtual this year, due to travel restrictions. I enjoy physical conferences, so I was a bit skeptical about virtual conferences, despite their necessity. Since ICLR 2020 was the first major machine learning conference to be virtual, this was a good chance to test my skepticism. Here’s a personal report about the parts of the conference that I had a chance to experience, how well they worked for me, and what I would do differently the next time I attend a virtual conference.

At the top, the bottom line: ICLR was better than I expected a virtual conference could be. Both the format and the code should be a model for other computer science conferences that are going virtual. Massive congratulations are in order for all of the organizers, including the chairs Sasha Rush, Shakir Mohamed, Dawn Song, Kyunghyun Cho, and Martha White. There’s still more work to be done to make virtual conferences great, because we haven’t yet figured out serendipity, or how to recreate the fluid social dynamics of a large meeting. But now we have a great starting point to experiment from.

What is it? Who was there? What happened?

The International Conference on Learning Representations (ICLR) is one of the largest machine learning research conferences. The physical ICLR 2019 conference had 2700 attendees. This year, the conference accepted 687 research papers from 2594 submissions.

The organizers of ICLR put in an incredible amount of effort at short notice to build custom software to support the virtual conference (their code is open source, see links at the end). The conference format featured prerecorded talks, live Q&A sessions for invited speakers, and text chat rooms. There were other parts of the conference where I didn’t spend much time: I only spent a little time at the workshops, and I didn’t have a chance to attend the virtual vendor booths or official conference socials. ICLR 2020 was big, even bigger than the in-person machine learning conferences: over 5600 registrations, 1300+ speakers, one million page views, and 100k+ video watches.

What I loved about Virtual ICLR

Surprisingly, I really loved the poster sessions. Every accepted research paper was presented as a virtual poster. Every paper had a web page that collected links to the paper, the reviews (this is ICLR, so they use OpenReview), a five-minute video by the authors, a text chat room, and links to a Zoom videoconference room specifically for the paper.

Every day of the conference had five 2-hour poster sessions, spread out to be convenient for participants from different time zones. Of course, the videos were always available, but each paper was assigned to two different poster sessions at which the authors could attend the Zoom room and answer questions. Assigning each paper to two sessions was designed to increase the number of time zones that were convenient for each poster.

The five-minute videos were my favorite part of the conference. They were synchronized to slides, so if the talk was covering background you already knew, or if you missed a point, you could click back and forth in the slides. (In person, this is harder to do, and depends on the social and language skills of both participants.) I found the videos were a great way to get an overview of many papers and decide where I wanted to focus my time. They were much more polished and a little more in depth than the elevator pitch that you get at a physical poster when you ask, “Could you tell me a bit about this work?”

The Zoom rooms themselves were sparsely attended. I hopped into the rooms for probably a dozen papers, and usually I was the only one there, apart from the authors. Perhaps twice there was one person already asking a question. For me, this was great, because then I could start a nice conversation with the authors (some tips on that later). The downside is that I have learned a lot from hearing questions that my colleagues have asked at posters, and that didn’t happen here.

The text chat rooms for each paper were active. People would ask detailed technical questions and the authors would respond. Some of the discussions were fairly in depth.

One way that virtual posters are arguably better than physical: you could not see who else was looking at the poster. Indeed, even though the Zoom rooms were generally quiet, the posters got a fair amount of attention. The median poster had 200 unique views, and the most popular had over 1000. Machine learning has been a trend-seeking community for as long as I’ve been a part of it. At physical conferences, posters from famous authors and research labs become very crowded, so much so that they become short talks (a poster session of repeated loops, like in Westworld) rather than scientific discussions. Hiding poster popularity prevented the bad aspects of the rich-get-richer dynamics.

Each poster session had its own page that listed the title and author of each paper (but no institutions, again to prevent rich-get-richer) along with an automatically generated thumbnail. These were randomly ordered for fairness.

The invited talks worked well also. These were prerecorded longer talks with a live Q&A. If I remember right, the videos of the talks were made available the day before the Q&A. This worked well because what you get out of an invited talk, you can get just as much from a recording, and the live Q&As were very active. In some cases, speakers stayed for over an hour answering questions over both audio and text chat.

The overall paper visualization was well done. This plotted all of the papers as points in 2D space, organized by topic. You could highlight papers by keyword, author, and so on. These 2D visualizations are often a mess (I even have a paper discussing that), but this one seemed to locate papers together sensibly. I took some pride in that one of my papers was an outlier at the edge of the display.

Virtual vs Physical?

For me, there are two points to coming to a conference:

  1. Networking. Meeting new people who I can learn from, and learning new things from old colleagues.
  2. Broadening my view. I can attend a lot of posters and talks to get a broad snapshot of subareas that I might not have time to follow in detail when arxiv preprints come out. The curation function that a conference serves, as problematic as it is, really helps me here.

These worked differently at a virtual conference. For broadening my view, the virtual conference was actually better, because the five-minute videos were overall nicer than the quick explanation that you might get at a physical poster.

For networking, as you would expect, my experience was more mixed. On the good side, it was easier to meet the poster presenters, because others were Zoom-shy. This was great for meeting PhD students. In a lot of ways, though, I found networking harder at a virtual conference. Here is what was harder to make happen:

  • It was hard to connect with old colleagues, because I didn’t know who was participating in the virtual event when.
  • I had fewer serendipitous meetings. I can’t count the number of research meetings that have started impromptu at local coffeeshops near the conference. Sometimes I will start talking to someone I don’t know well, only to find that there is an unexpected connection between our different research interests. This can result in collaborations, invited talks, grant proposals…
  • Networking cascades. When you meet one person, they might introduce you to others. This can happen naturally, like when you are talking to someone and one of their colleagues invites them to lunch. This is harder in a virtual format.

Advice for participants: How to virtual-conference with aplomb

Attending a virtual conference is a different skill from attending a physical one. Here are some things that I will try to do better next time:

  • As much as you can, clear your calendar. Treat the virtual conference as reserved time. You say, “That’s hard for me.” I know. It was hard for me too. But if you can reserve time to attend a physical conference, you can do it for a virtual conference. Really, no one will stop you.
  • Preparation helps. For physical conferences, I always have the best intentions to read lots of papers on the plane, but it doesn’t always happen. For virtual conferences, I think that preparation is more important because the face-to-face time (Zoom rooms, etc.) is even more limited. I wish that I had watched more of the five-minute videos in advance of the poster sessions. I think it’s also good to be proactive about setting up individual meetings before the conference.
  • How not to be shy during poster sessions. I have the impression that many attendees were reluctant to participate in the poster sessions because it felt a bit socially awkward, and it was hard to come up with a good question.

I have some advice for asking virtual poster questions (this is an advice blog, after all). First, embrace the awkwardness. Yeah, it is a bit awkward at first, but once you start talking, it’s just another technical conversation like many you’ve had before. Second, yeah, you do need a question, but it doesn’t need to be brilliant, and you don’t need to have read the paper. “I watched the video, but I didn’t understand how the Thagomizer works. Could you tell me a bit more about that?” “Is your regularization scheme related to curriculum learning?” “I’m really interested in your work, because my recent work is about machine learning for baking sourdough. Do you think that your work could be applied to that?” It’s that easy to start a conversation!

What more could we do next time?

I take away two big positive lessons from virtual ICLR: poster sessions can work virtually, and text chat helps a lot. How can we do even better? How do we get more serendipity and networking in a virtual conference?

To support social networking, there is lots of room to experiment with using videoconferencing creatively. Perhaps we could see lists of who is around the virtual conference at any one time (allowing people to explore anonymously if they prefer). Perhaps we can have low-friction ways to create one-off videoconferencing rooms, which others can see and join. Maybe we could consider having “virtual receptions”: a one-hour block of time where many people are present to have discussions, and people can show up and self-organize videoconference discussions of even 3 or 4 people. I sometimes joke that perhaps we could virtually replicate running into someone in the hallway: 10% of the time, when you attempt to join the videoconference room for a poster, you are redirected to a videoconference with a random participant instead. That’s a bit silly, but random chat groups, perhaps in larger groups, and with safeguards to prevent abuse, might be feasible. Perhaps we will find creative ways to use social media during a virtual conference.

A great example of creative videoconferencing was ICLRTown: a 2D virtual world cum videoconferencing system. You moved a little 2D avatar around an 80s-style graphic of a conference center. When you got close enough to someone else’s sprite, you saw their video and could talk to them. We got chats of up to 9 people that way. This was fun to play with, but it still seems a work in progress.

Finally, I wonder to what extent the huge scale of ICLR actually helped it to be a virtual conference, making it easier to spread across timezones, or easier to get a critical mass for live Q&A. Most computer science conferences are smaller than ICLR, so different models might work well for them.

Resources about ICLR and virtual conferences

If this post wasn’t enough, here is more you can read if you find yourself in the exciting but scary position of organizing a virtual conference:

Tags: conferences

How do you break into a career in machine learning?

By Charles Sutton on February 1, 2020

I got a question the other day about how to start a career in machine learning. I gave the best answer that I could, but I’m not sure that my best was very good. Can you help? If so, join the discussion on social media (or send me a note privately).

The question was:

I am currently studying for a master’s at [a good university outside the United States -cas] while working part-time as an NLP research engineer. I would like to ask you for some advice if that was possible.

My question is: without having outstanding grades or publications in top AI journals, how could I find my way towards a top Ph.D. program or at least research internship, is there any possibility? I am currently working on deep learning (paid job) and have some Ph.D. offers. Still, I feel that internships at companies like Google or Ph.D. positions at top research centers are impossible without previous experience in a similar place, which is like a snake biting its tail. Except for students with massive GPA scores, which is not my case.

I am happy with my current job, but so far, I have just been able to grasp the opportunities that I found, so I am thinking about trying to go abroad. Everything I have found is very applied, and I would like to study more abstract or generic (even exotic) topics, instead of applying existing neural architectures to specific domains.

I wrote:

Lots and lots of applicants to computer science PhD programs want to do machine learning, so admissions is very competitive. I don’t think it’s necessary to have publications to get into a PhD program, although it does help, and the higher you go in the rankings, the more that you need any help that you can get.

I’m not sure that I have better advice than to learn as much as you can, do good work, network with others, and work your way up the prestige ladder. It’s true that going to a very highly ranked school gives you an advantage, but I know very good researchers who did not have very good grades in undergrad, and even if you do your PhD at a lower ranked place, if your work is good, it still can stand out.

And now, a question for my readers (all three of them), what do you think?

The three PhD Comic strips that are actually good research advice

By Charles Sutton on February 18, 2019

If you’re reading this blog, then you already know about PhD Comics. If you really haven’t seen them before, click the link and read them now. They are more insightful and funnier than anything in this blog.

It goes without saying, however, that you should not model your own career on the characters in the PhD comics strip. For one thing, they’ve been in grad school for more than 20 years.

Amazingly, though, there are three PhD comic strips, and probably only three, that are actually good research advice:

  • Writing your thesis outline. A thesis is daunting. How do you write an entire book over five-plus years? Instead, I like to tell my students to think and plan at the level of individual papers. Basically, you have three content chapters of your thesis, and so if you have three strong papers that fit together thematically, then you set up one paper per chapter, and there you are! No sweat. I call this the “PhD Comics Guide to Writing Your Thesis.”

  • Amount of time writing one email. I saw this comic when I was a junior professor, and I immediately realized: (a) this is so true, and (b) I needed to act more like the professor in the comic strip. This is how I learned that when you have many things to decide, you must decide quickly.

  • The evolution of intellectual freedom. Sometimes you have to take big risks in your work and follow your own star. Once you learn the basic technical skills needed for research, it is so easy to do only incremental work, follow what the cool people are doing, and focus on what’s likely to get you jobs and funding. There are good reasons to do some of this, but if this is all that you do, then why are you in research?

Know thyself. The New Year's resolution that underlies all productivity advice

By Charles Sutton on January 5, 2019

I’m a sucker for New Year’s resolutions. Every year I make up a half dozen resolutions, usually the same ones each year, and carefully track my progress for at least two or three months before I get busy and forget all about them. And in all seriousness, I’m happy about this, because sometimes, for maybe one resolution in four, I’m still able to make a lasting change in my habits. That’s more than enough to justify the effort, as long as I take the failures in good humor.

You don’t have to be as silly about resolutions as I am, or even to have any resolutions at all, but the underlying principle is important for any creative work. You could say that it’s the underlying principle behind all of the advice in this blog.

You need to know yourself, understand the way you think, adapt the way you work to the way you think, and always keep looking for ways to work better.

You will have all kinds of little preferences about when you are most alert, creative, and productive. Maybe you like to work in the morning. Maybe you like to have a bit of background noise, like in a coffee shop. Maybe you need almost absolute quiet. Maybe you like to work from home, or maybe you prefer the structure of having an office, where your work space is separate from home. Maybe you like to pace around the office, talking to yourself and gesticulating wildly. Or maybe that’s just me. Ahem.

Whatever it is, you need to learn what makes you think most effectively, and seek out that environment. No one can do that for you. Your best work space will be different for you than it is for me. (And good thing, too, otherwise all of Google would be people bumping into each other in the hallways because they were too busy talking to themselves.) The only way to know is to experiment and find out what works for you.

And it’s also vital for us to keep experimenting, no matter how senior we are in our careers. One reason is (I think this is from David Allen), “The better you get, the better you’d better get.” As you become more accomplished, you gain a reputation which means that more demands are placed on you. Another reason is that no matter how good you are, you haven’t learned all of the tricks. Your mental rhythms change as you get older, just as an athlete in his thirties trains differently than a teenager. Finally, the creative challenges change as you get later in your career, as you need to learn to adapt to the way that the field has changed in twenty years.

What’s always appealed to me about the research career is that you never stop learning. This is as true for how you set up the environment of your work as it is for the content of your work.