Research Philosophy

Some machine learning philosophy

The Bitter Lesson by Rich Sutton

Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation.

ICLR 2019 Keynote by Léon Bottou

  • The statistical problem is only a proxy.
  • Nature does not shuffle the examples. We shouldn’t.
  • Invariance across environments buys extrapolation powers.
  • Invariance across environments is related to causation.
  • Invariant representations enable invariance.
  • Realizable problems bring different challenges.



A research guide from the greats

Ideas are cheap. Implementation, experimentation, and explanation are not.

Richard Hamming

  • Presentation video and transcript
  • You’ve got to work on important problems. MY: if you do not work on important problems, how can you produce important work?
  • In summary, I claim that some of the reasons why so many people who have greatness within their grasp don’t succeed are: they don’t work on important problems, they don’t become emotionally involved, they don’t try and change what is difficult to some other situation which is easily done but is still important, and they keep giving themselves alibis why they don’t. They keep saying that it is a matter of luck. I’ve told you how easy it is; furthermore I’ve told you how to reform. Therefore, go forth and become great scientists!
  • The person who works on the right problem at the right time in the right way is what counts, and nothing else.
  • Always ask yourself: what is the important problem in my field?

Takeo Kanade

  • Presentation video
  • Convince. If it works, it is convincing. “If they ask how your method works, they are not yet convinced. If they were, they would ask how much it is.”
  • Do fast. “If you come up with a good idea, there are at least two more people who think the same.” “Who said it first is not important, who gets there first is.”
  • Think like an amateur, act like an expert.

Video Compression

Learning Binary Residual Representations for Domain-specific Video Streaming


This post briefly summarizes our work on using deep learning to improve video streaming quality for existing video compression standards. The work was published as an AAAI 2018 paper (joint work with Yi-Hsuan Tsai, Deqing Sun, Ming-Hsuan Yang, and Jan Kautz). Some results can be found in the accompanying video.


Existing video compression standards (e.g., MPEG4, H.264, and HEVC) can effectively compress most of the information in a video. What remains uncompressed are the residual images: the differences between the compressed and the original videos. Residual images are difficult to compress because they contain highly non-linear, domain-specific patterns. In this work, we ask the following two questions, hoping that they lead us to better video compression for streaming.

  1. Can one improve existing video compression algorithms to achieve better visual quality at the same bandwidth, if one is willing to restrict the resulting algorithm to a specific domain?
  2. Can the improved algorithm be seamlessly integrated into existing video compression standards?
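To make the notion of a residual image concrete, here is a minimal NumPy sketch with made-up 2×2 pixel values (not data from the paper): the residual is the per-pixel difference between the original frame and its lossily decoded version, and adding it back restores the original exactly.

```python
import numpy as np

# Hypothetical frames (made-up values, for illustration only).
# `decoded` stands for what a lossy codec such as H.264 reconstructs.
original = np.array([[120, 130], [140, 150]], dtype=np.int16)
decoded  = np.array([[118, 133], [141, 147]], dtype=np.int16)

# The residual image: per-pixel difference between original and decoded.
residual = original - decoded

# Transmitting the residual on top of the compressed stream allows
# exact reconstruction of the original frame.
restored = decoded + residual
print(residual.tolist())  # [[2, -3], [-1, 3]]
```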

Why 1? We ask the first question because many interesting video streaming services are domain-specific. For example, when using a video game streaming service (e.g., NVIDIA GeForce Now), the game video is first rendered on the GPU server, then compressed and delivered to the end user. Over a period of hours, the videos to be streamed all come from the same domain.

Why 2? We ask for a seamless integration because we want to leverage the existing video compression infrastructure, including hardware-optimized computation and well-established software stacks.


We believe that we have come up with a hybrid system that addresses the two questions above. In our hybrid system, we first apply an existing video compression algorithm (e.g., H.264) to compress domain-specific videos, and train a binary autoencoder (whose latent representations are either 0 or 1) to encode the resulting residual images into binary representations, frame by frame. We then apply Huffman coding to losslessly compress the binary representations. The compressed binary representations can be sent to the user in the metadata field of the existing video streaming network packet. This way, our system is compatible with the existing video streaming standard. We illustrate our system design in the following figure.
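As a sketch of the lossless entropy-coding step (not the paper's implementation), the snippet below groups a hypothetical binary latent map into 4-bit symbols and Huffman-codes them with the Python standard library. The 4-bit grouping and the example bit pattern are assumptions for illustration.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code (symbol -> bitstring) for a symbol stream."""
    freq = Counter(symbols)
    # Heap entries: (frequency, unique tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least-frequent subtrees, prefixing their codes.
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical binary latent map for one frame, flattened to a bit string
# (made-up pattern; real maps would come from the trained autoencoder).
bits = "0000" * 10 + "1111" * 3 + "0011" * 2
# Group bits into 4-bit symbols so Huffman coding can exploit
# frequently repeated local patterns.
symbols = [bits[i:i + 4] for i in range(0, len(bits), 4)]
code = huffman_code(symbols)
encoded = "".join(code[s] for s in symbols)
print(len(bits), "->", len(encoded))  # 60 -> 20
```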



This allows us to achieve better visual quality using the same channel bandwidth. Specifically, we reduce the bandwidth assigned to the existing video compression algorithm and use the saved bandwidth to transmit the binary residual representation computed by the binary autoencoder. Our experimental results show that, in this way, the hybrid system can improve visual quality by up to 1.7 dB. The results for the SkyRim game are shown in the following figure (click and zoom in for a better view). For more details about the algorithm and the experimental results, please check out our paper.
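The dB figure above refers to PSNR, the standard visual-quality metric. As an illustration with synthetic frames (not the paper's data), the sketch below shows how shrinking the per-pixel error, for instance by adding a decoded residual back, translates into a PSNR gain:

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two frames."""
    err = ref.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(err ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

# Synthetic 8-bit frames (illustration only).
ref = np.full((4, 4), 128, dtype=np.uint8)
coarse = ref - 4    # decoded frame before residual refinement
refined = ref - 2   # after adding the decoded residual back

# Halving the per-pixel error quarters the MSE: a 10*log10(4) ~ 6 dB gain.
gain = psnr(ref, refined) - psnr(ref, coarse)
print(round(gain, 2), "dB")  # 6.02 dB
```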